NoShade.Vision

Cross-Sectional backtests in python

Chad-Thackray

The video covers how to backtest cross-sectional strategies in pure Python, without a backtesting framework. The presenter explains data gathering with yfinance, pandas, and numpy, taking care to avoid look-ahead bias and to keep the indicator data frame the same length as the price data frame. The backtest iterates over every row of the data frame to apply the trading logic, selecting the K highest or lowest assets from those available. Subsequent sections cover implementing a function that returns the indexes of the highest or lowest K values, delaying the backtest until a minimum number of assets is available, consolidating contiguous blocks of trades to calculate fees, and running the backtest with a for loop over assets to eliminate a key weakness. The speaker also warns about errors caused by NaN values and about survivorship bias, and suggests optimizing with pure numpy functions and a just-in-time compiler.

00:00:00

In this section of the video, the presenter explains how to backtest cross-sectional strategies using pure Python without any frameworks. They demonstrate a simple strategy that sells at the open of the next bar and buys the two assets with the highest RSI. The video walks through the data gathering portion of the script, using yfinance to download the daily data and pandas and numpy for data analysis. They explain how to calculate the indicator from the close of each bar to avoid look-ahead bias, applying the RSI function to each column with a lambda function. The presenter mentions that this generic script can be extended to any universe of assets, such as stocks, Forex, or crypto.
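The per-column indicator step can be sketched as follows. This is a minimal illustration, not the presenter's script: it assumes a simple-moving-average RSI (the video's exact RSI variant is not stated), it uses synthetic close prices in place of a yfinance download, and the names `rsi`, `closes`, and `indicator` are hypothetical.

```python
import numpy as np
import pandas as pd

def rsi(close: pd.Series, length: int = 14) -> pd.Series:
    """Simple-average RSI (an assumed variant, not necessarily the video's)."""
    delta = close.diff()
    gains = delta.clip(lower=0).rolling(length).mean()
    losses = (-delta.clip(upper=0)).rolling(length).mean()
    rs = gains / losses
    return 100 - 100 / (1 + rs)

# Synthetic daily closes standing in for downloaded data (assumption).
rng = np.random.default_rng(0)
dates = pd.date_range("2023-01-02", periods=60, freq="B")
closes = pd.DataFrame(
    100 + rng.normal(0, 1, size=(60, 3)).cumsum(axis=0),
    index=dates,
    columns=["AAA", "BBB", "CCC"],
)

# Apply the indicator to every column; each value uses closes up to that bar only.
indicator = closes.apply(lambda col: rsi(col, 14))
```

Because the indicator keeps the same index and columns as the price frame, row `i` of one lines up with row `i` of the other, which is the consistency the summary asks for.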

00:05:00

In this section, the speaker discusses the data gathering and indicator calculation process and how users can customize it to their needs. They caution against look-ahead bias and advise keeping the indicator data frame the same length as the price data frame. They then explain the backtesting process, in which the user iterates over every row of the data frame to program the trading logic. The logic takes the indicator at the current bar to make trading decisions and pairs it with the next bar's open price, which is where the user would buy the asset. They also explain how to select a certain number of assets out of those available by partitioning a list on its highest or lowest K values.

00:10:00

In this section, the presenter explains how to implement a Python function that returns the indexes of the lowest or highest elements in a list, given a number of holdings. The function takes a positive or negative integer and returns the indexes of that many values, counted from either the lowest or the highest end of the list. The presenter warns of potential errors caused by NaN values in the list and suggests replacing them with negative infinity so the function works correctly. They also suggest adding a check on the number of assets available in the list, to avoid picking more holdings than there are assets.
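A function of this kind might look like the sketch below. It assumes `np.argpartition` as the partitioning tool and a convention where positive `k` means highest and negative `k` means lowest; the name `k_extreme_indexes` is hypothetical. The summary only mentions replacing NaN with negative infinity; using positive infinity on the lowest-K side is an added assumption so that missing values are never selected in either direction.

```python
import numpy as np

def k_extreme_indexes(values, k):
    """Indexes of the k highest (k > 0) or |k| lowest (k < 0) values."""
    arr = np.asarray(values, dtype=float)
    n = abs(k)
    available = int(np.count_nonzero(~np.isnan(arr)))
    # Guard against asking for more holdings than there are assets.
    if n == 0 or n > available:
        raise ValueError(f"cannot pick {n} holdings from {available} available assets")
    if k > 0:
        # NaN sorts last in numpy, so push it to the bottom instead.
        filled = np.where(np.isnan(arr), -np.inf, arr)
        return np.argpartition(filled, arr.size - n)[arr.size - n:]
    # Assumption: for the lowest side, NaN is pushed to the top.
    filled = np.where(np.isnan(arr), np.inf, arr)
    return np.argpartition(filled, n - 1)[:n]

values = [3.0, np.nan, 9.0, 1.0, 7.0]
highest = sorted(k_extreme_indexes(values, 2).tolist())   # indexes of 9.0 and 7.0
lowest = sorted(k_extreme_indexes(values, -2).tolist())   # indexes of 1.0 and 3.0
```

`np.argpartition` does not order the selected indexes among themselves, which is why the usage lines sort them before comparing.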

00:15:00

In this section, the presenter demonstrates how to delay the backtesting until a minimum number of tradable assets are available. They set a condition that the number of assets should be greater than or equal to twice the number of holdings to ensure that the strategy only runs when there are enough assets to choose from. They also introduce a new empty list called "trades" to record all virtual trades made during backtesting. This list will include the asset name, time of purchase, purchase price, sale price, and return. The presenter notes that it's possible to buy the same assets two bars in a row, but these trades can be treated as canceling out since they occur at the same price level.
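The loop described so far can be sketched as below, under stated assumptions: the indicator and open prices are small hand-made numpy arrays rather than real data, a NaN indicator marks an asset that is not yet tradable, the decision at bar `i` buys at the open of bar `i + 1` and sells at the open of bar `i + 2`, and the record field names are hypothetical.

```python
import numpy as np

assets = ["AAA", "BBB", "CCC", "DDD"]
# Rows are bars, columns are assets; NaN means "not tradable yet".
indicator = np.array([
    [np.nan, 55.0, np.nan, np.nan],
    [60.0, 50.0, np.nan, np.nan],
    [40.0, 70.0, 65.0, 30.0],
    [45.0, 35.0, 80.0, 75.0],
    [50.0, 52.0, 51.0, 53.0],
    [50.0, 52.0, 51.0, 53.0],
])
opens = np.array([
    [np.nan, 10.0, np.nan, np.nan],
    [20.0, 10.0, np.nan, np.nan],
    [21.0, 11.0, 30.0, 40.0],
    [22.0, 10.0, 30.0, 41.0],
    [20.0, 11.0, 33.0, 40.0],
    [21.0, 12.0, 30.0, 44.0],
])

holdings = 2
trades = []  # one record per virtual one-bar trade

for i in range(len(indicator) - 2):
    row = indicator[i]
    available = np.flatnonzero(~np.isnan(row))
    # Wait until there are at least twice as many assets as holdings.
    if len(available) < 2 * holdings:
        continue
    ranked = available[np.argsort(row[available])[::-1]]  # highest first
    for j in ranked[:holdings]:
        buy_price = opens[i + 1, j]
        sell_price = opens[i + 2, j]
        trades.append({
            "asset": assets[j],
            "buy_time": i + 1,
            "buy_price": buy_price,
            "sell_price": sell_price,
            "return": sell_price / buy_price - 1,
        })
```

In this toy data CCC is selected on two consecutive bars, so it is "sold" and "re-bought" at the same open price of 33.0, which is the canceling-out case the presenter mentions.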

00:20:00

In this section, the speaker explains how to convert a list of trades into a pandas data frame to easily see which assets were bought on which day during the backtesting process. This view is valuable for debugging purposes and quickly assessing how the strategy is performing without fees. The speaker also discusses how to calculate daily returns using this data frame and how to plot the equity curve for the backtest. However, the backtest doesn't take into account fees and has some other issues, which will be covered later.
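A sketch of this reporting step, assuming equal weighting across the holdings on each day (the summary does not state the weighting) and a hand-made `trades` list with hypothetical field names:

```python
import pandas as pd

trades = [
    {"asset": "BBB", "buy_time": "2023-01-03", "return": 0.10},
    {"asset": "CCC", "buy_time": "2023-01-03", "return": 0.02},
    {"asset": "CCC", "buy_time": "2023-01-04", "return": -0.04},
    {"asset": "DDD", "buy_time": "2023-01-04", "return": 0.00},
]
trades_df = pd.DataFrame(trades)

# Debugging view: which assets were bought on which day.
held = trades_df.groupby("buy_time")["asset"].apply(list)

# Assumed equal weighting: the daily return is the mean of that day's trades.
daily = trades_df.groupby("buy_time")["return"].mean()

# Equity curve before fees, starting from 1.0.
equity = (1 + daily).cumprod()
```

With matplotlib installed, `equity.plot()` would draw the curve.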

00:25:00

This section covers the importance of accounting for fees and survivorship bias when conducting cross-sectional backtesting in Python. Subtracting fees from daily returns gives a more accurate estimate of profits and losses. For larger universes of assets, it is also important to consider survivorship bias and potential delistings when selecting assets to trade. Additionally, the video suggests checking for contiguous blocks of trades in the same asset to prevent unrealistic trading scenarios, and iterating through the trades to ensure accuracy. While the video does not provide a complete coding walkthrough, it offers valuable insights for building a moderately accurate backtesting engine for cross-sectional strategies.
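The fee adjustment can be illustrated with a few lines of plain Python. The fee level (0.1% per side) and the daily returns are made-up numbers, and subtracting a flat round-trip fee from each daily return is a simple approximation rather than the video's exact calculation.

```python
fee = 0.001  # assumed cost per side (0.1%)
daily_returns = [0.06, -0.02, 0.01]  # hypothetical gross daily returns

# Naive approach: every day pays one buy and one sell.
net_returns = [r - 2 * fee for r in daily_returns]

gross_equity = 1.0
net_equity = 1.0
for gross, net in zip(daily_returns, net_returns):
    gross_equity *= 1 + gross
    net_equity *= 1 + net
```

This version charges a full round trip every day, so it overstates costs whenever the same asset is held across consecutive bars, which is the motivation for looking at contiguous blocks of trades.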

00:30:00

In this section, the speaker discusses their approach to consolidating contiguous blocks of trades into a single trade that buys at one time and sells at a later time. They note that keeping track of the trades bought and the assets held during the backtest can be helpful for debugging but gets complicated quickly, so they suggest consolidating the trades afterwards, at the analysis and reporting stage. They create a new list of trades called "new trades" and loop through the current trades, checking whether a trade is open or closed. If no trade is open, they open one by setting the buy price and time, then loop forward to find the end of the trade, close it out by setting the sell price, and append it to the new trades list. They then convert the new trades list into a data frame for analysis.
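One way to sketch the consolidation, in the same spirit as the open/close loop described above but not a reproduction of the presenter's code: trades are grouped per asset, and a trade whose buy time equals the previous trade's sell time extends the open trade instead of starting a new one. The function name `consolidate` and the record fields are hypothetical.

```python
def consolidate(trades):
    """Merge back-to-back one-bar trades in the same asset into single trades."""
    by_asset = {}
    for trade in sorted(trades, key=lambda t: t["buy_time"]):
        by_asset.setdefault(trade["asset"], []).append(trade)

    new_trades = []
    for legs in by_asset.values():
        current = None  # the trade that is currently open for this asset
        for leg in legs:
            if current is not None and leg["buy_time"] == current["sell_time"]:
                # Contiguous: keep the trade open and move its exit forward.
                current["sell_time"] = leg["sell_time"]
                current["sell_price"] = leg["sell_price"]
            else:
                if current is not None:
                    new_trades.append(current)
                current = dict(leg)  # open a new trade
        if current is not None:
            new_trades.append(current)

    for trade in new_trades:
        trade["return"] = trade["sell_price"] / trade["buy_price"] - 1
    return sorted(new_trades, key=lambda t: (t["buy_time"], t["asset"]))

trades = [
    {"asset": "BBB", "buy_time": 3, "sell_time": 4, "buy_price": 10.0, "sell_price": 11.0},
    {"asset": "CCC", "buy_time": 3, "sell_time": 4, "buy_price": 30.0, "sell_price": 33.0},
    {"asset": "CCC", "buy_time": 4, "sell_time": 5, "buy_price": 33.0, "sell_price": 30.0},
    {"asset": "DDD", "buy_time": 4, "sell_time": 5, "buy_price": 40.0, "sell_price": 44.0},
    {"asset": "BBB", "buy_time": 6, "sell_time": 7, "buy_price": 12.0, "sell_price": 13.0},
]
merged = consolidate(trades)
```

Here the two CCC trades collapse into one trade from bar 3 to bar 5, while the two BBB trades stay separate because there is a gap between them. Fees can then be charged once per consolidated trade.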

00:35:00

In this section, the speaker explains how to perform cross-sectional backtesting by iterating over the individual assets within a for loop. This allows for a more reliable calculation of fees and eliminates a main weakness of the backtesting engine. While the method can be slow, the speaker recommends optimizing it with pure numpy functions and a just-in-time compiler. Overall, the speaker hopes viewers can use this knowledge to build and extend their own backtesting engines and to gain a better understanding of how backtesters work, even if they end up using someone else's framework.
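The summary does not show the loop itself, so the following is only one possible reading of it: an inner loop over assets compares what was held on the previous bar with the new selection and counts a fee side only when a position actually changes. The target selections and all names are hypothetical.

```python
assets = ["BBB", "CCC", "DDD"]
# Hypothetical target holdings chosen at each bar.
targets = [{"BBB", "CCC"}, {"CCC", "DDD"}, {"CCC", "DDD"}, set()]

held = set()
sides_per_bar = []  # number of buys plus sells that incur a fee on each bar

for target in targets:
    sides = 0
    for asset in assets:  # inner loop over every asset in the universe
        was_held = asset in held
        now_held = asset in target
        if was_held != now_held:  # an entry or an exit costs one fee side
            sides += 1
    sides_per_bar.append(sides)
    held = set(target)

total_sides = sum(sides_per_bar)
```

Positions that carry over from one bar to the next pay nothing under this accounting, so the total here is 6 fee sides, compared with 12 if both holdings were sold and re-bought on each of the three invested bars.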
