Candlestick Plot

In this lecture, we'll use another case study to learn about visualizing time-series data. We'll focus on the candlestick plot, a special kind of bar chart commonly used on trading platforms, in foreign exchange, and in weather forecasts.

Candlestick charts originated in Japan during the 1700s, when a Japanese trader named Homma analyzed the price patterns of rice contracts to make huge profits. His research on so-called price pattern recognition was widely credited and shaped the way trading developed in Japan.

A candlestick plot delivers important information to traders for a given time period. It shows the opening price, closing price, highest trading price and lowest trading price of a particular commodity over that period. The rectangular portion of the candle is called the Body. The lines above and below the Body are called the Upper Shadow and the Lower Shadow respectively. The highest trading price is marked by the top of the Upper Shadow and the lowest trading price is marked by the bottom of the Lower Shadow.
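As a minimal sketch of this anatomy, the body and the two shadows can be computed directly from the four prices. The `Candle` class and its property names are hypothetical helpers introduced only for illustration; the sample values are the first AAPL row of the dataset loaded later in this lecture.

```python
from dataclasses import dataclass


@dataclass
class Candle:
    """Hypothetical helper holding one period's OHLC prices."""
    open: float
    high: float
    low: float
    close: float

    @property
    def body(self) -> float:
        # Height of the rectangular body: distance between open and close
        return abs(self.close - self.open)

    @property
    def upper_shadow(self) -> float:
        # Line from the top of the body up to the highest trading price
        return self.high - max(self.open, self.close)

    @property
    def lower_shadow(self) -> float:
        # Line from the bottom of the body down to the lowest trading price
        return min(self.open, self.close) - self.low


candle = Candle(open=10.34, high=10.678572, low=10.321428, close=10.678572)
print(round(candle.body, 6), round(candle.upper_shadow, 6), round(candle.lower_shadow, 6))
```

On this particular day the close equals the high, so the candle has no upper shadow at all.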

Candlesticks can display various trading patterns such as bullish, bearish and many more. In particular, a green candle represents a bullish pattern, in which the trading price increased over the time period. The bottom of the candle body shows the opening price and the top of the candle body shows the closing price.

Conversely, a red candle shows a bearish pattern, in which the price dropped over the time period. In this case the top of the candle body shows the opening price and the bottom of the candle body shows the closing price.
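The green/red rule boils down to a comparison between the opening and closing price. Here is a small sketch of that rule; `candle_color` is a hypothetical function written for illustration, not part of any plotting library, and the prices come from the AAPL data shown later in this lecture.

```python
def candle_color(open_price: float, close_price: float) -> str:
    """Hypothetical helper: colour of a candle under the green-up / red-down convention."""
    if close_price > open_price:
        return "green"   # bullish: closed above the open
    if close_price < open_price:
        return "red"     # bearish: closed below the open
    return "flat"        # open equals close; how this case is drawn varies by library


# (open, close) pairs from the AAPL rows for 2006-01-03 and 2006-01-04
first_day = candle_color(10.340000, 10.678572)
second_day = candle_color(10.732857, 10.710000)
print(first_day, second_day)
```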

In case you're curious, Investopedia has a great page that shows a variety of trading patterns related to candlestick plots.

There are many possible ways to make candlestick plots in Python; mpl_finance, plotly, and finplot are among the most common libraries. We'll focus on the mpl_finance option.

In [14]:
# Let's bring in pandas as pd as usual
import pandas as pd
# To work with how datetimes are formatted in the dataset,
# we import the built-in datetime module
import datetime
# We also import matplotlib.pyplot as plt for making plots
import matplotlib.pyplot as plt
# and import matplotlib's date utilities as mdates
# https://matplotlib.org/3.3.2/api/dates_api.html
import matplotlib.dates as mdates

# Let's import the stock data with pd.read_csv,
# parse the Date column into a datetime type and use it as the index
stock_data = pd.read_csv('assets/stocks.csv', parse_dates = ['Date'], index_col = 'Date')
# and let's check it out.
stock_data.head()
Out[14]:
Ticker High Low Open Close Volume Adj Close
Date
2006-01-03 AAPL 10.678572 10.321428 10.340000 10.678572 201808600.0 9.319328
2006-01-04 AAPL 10.854285 10.642858 10.732857 10.710000 154900900.0 9.346760
2006-01-05 AAPL 10.700000 10.535714 10.690000 10.625714 112355600.0 9.273201
2006-01-06 AAPL 10.957143 10.650000 10.750000 10.900000 176114400.0 9.512576
2006-01-09 AAPL 11.028571 10.820000 10.961429 10.864285 168760200.0 9.481407
In [15]:
# Several companies are included in the stock data, so it's good practice to
# keep a list of the tickers so we can make a candlestick plot for each one
company_list = stock_data['Ticker'].unique().tolist()

# We can run a for loop to look at the stock data more closely
# So for each company in company_list
for company in company_list:
    # we print out the stock data associated with each company
    print(stock_data.groupby('Ticker').get_group(company))

# So you'll see there are 2517 trading days from 2006 to 2015. For each trading day,
# the prices of AAPL, MSFT, IBM, GOOG and the S&P 500 index (^GSPC) are given as 4 values,
# namely High, Low, Open and Close.
# This is a really cool dataset!
           Ticker        High         Low        Open       Close  \
Date                                                                
2006-01-03   AAPL   10.678572   10.321428   10.340000   10.678572   
2006-01-04   AAPL   10.854285   10.642858   10.732857   10.710000   
2006-01-05   AAPL   10.700000   10.535714   10.690000   10.625714   
2006-01-06   AAPL   10.957143   10.650000   10.750000   10.900000   
2006-01-09   AAPL   11.028571   10.820000   10.961429   10.864285   
...           ...         ...         ...         ...         ...   
2015-12-24   AAPL  109.000000  107.949997  109.000000  108.029999   
2015-12-28   AAPL  107.690002  106.180000  107.589996  106.820000   
2015-12-29   AAPL  109.430000  106.860001  106.959999  108.739998   
2015-12-30   AAPL  108.699997  107.180000  108.580002  107.320000   
2015-12-31   AAPL  107.029999  104.820000  107.010002  105.260002   

                 Volume   Adj Close  
Date                                 
2006-01-03  201808600.0    9.319328  
2006-01-04  154900900.0    9.346760  
2006-01-05  112355600.0    9.273201  
2006-01-06  176114400.0    9.512576  
2006-01-09  168760200.0    9.481407  
...                 ...         ...  
2015-12-24   13570400.0  101.254143  
2015-12-28   26704200.0  100.120033  
2015-12-29   30931200.0  101.919609  
2015-12-30   25213800.0  100.588676  
2015-12-31   40912300.0   98.657883  

[2517 rows x 7 columns]
           Ticker       High        Low       Open      Close       Volume  \
Date                                                                         
2006-01-03   MSFT  27.000000  26.100000  26.250000  26.840000   79973000.0   
2006-01-04   MSFT  27.080000  26.770000  26.770000  26.969999   57975600.0   
2006-01-05   MSFT  27.129999  26.910000  26.959999  26.990000   48245500.0   
2006-01-06   MSFT  27.000000  26.490000  26.889999  26.910000  100963000.0   
2006-01-09   MSFT  27.070000  26.760000  26.930000  26.860001   55625000.0   
...           ...        ...        ...        ...        ...          ...   
2015-12-24   MSFT  55.959999  55.430000  55.860001  55.669998    9558500.0   
2015-12-28   MSFT  55.950001  54.980000  55.349998  55.950001   22458300.0   
2015-12-29   MSFT  56.849998  56.060001  56.290001  56.549999   27731400.0   
2015-12-30   MSFT  56.779999  56.290001  56.470001  56.310001   21704500.0   
2015-12-31   MSFT  56.189999  55.419998  56.040001  55.480000   27334100.0   

            Adj Close  
Date                   
2006-01-03  19.777893  
2006-01-04  19.873684  
2006-01-05  19.888428  
2006-01-06  19.829477  
2006-01-09  19.792625  
...               ...  
2015-12-24  51.513504  
2015-12-28  51.772598  
2015-12-29  52.327797  
2015-12-30  52.105721  
2015-12-31  51.337688  

[2517 rows x 7 columns]
           Ticker        High         Low        Open       Close      Volume  \
Date                                                                            
2006-01-03    IBM   82.550003   80.809998   82.449997   82.059998  11715100.0   
2006-01-04    IBM   82.500000   81.330002   82.199997   81.949997   9832800.0   
2006-01-05    IBM   82.900002   81.250000   81.400002   82.500000   7213400.0   
2006-01-06    IBM   85.029999   83.410004   83.949997   84.949997   8196900.0   
2006-01-09    IBM   84.250000   83.379997   83.900002   83.730003   6851100.0   
...           ...         ...         ...         ...         ...         ...   
2015-12-24    IBM  138.880005  138.110001  138.429993  138.250000   1495200.0   
2015-12-28    IBM  138.039993  136.539993  137.740005  137.610001   3143400.0   
2015-12-29    IBM  140.059998  138.199997  138.250000  139.779999   3943700.0   
2015-12-30    IBM  140.440002  139.220001  139.580002  139.339996   2989400.0   
2015-12-31    IBM  139.100006  137.570007  139.070007  137.619995   3462100.0   

             Adj Close  
Date                    
2006-01-03   57.011086  
2006-01-04   56.934662  
2006-01-05   57.316772  
2006-01-06   59.018898  
2006-01-09   58.171333  
...                ...  
2015-12-24  117.297935  
2015-12-28  116.754913  
2015-12-29  118.596077  
2015-12-30  118.222755  
2015-12-31  116.763412  

[2517 rows x 7 columns]
           Ticker        High         Low        Open       Close      Volume  \
Date                                                                            
2006-01-03   GOOG  217.021545  208.329132  210.471100  216.802368  26340700.0   
2006-01-04   GOOG  223.641739  219.053925  221.121185  221.788681  30687300.0   
2006-01-05   GOOG  224.931900  219.925659  222.167267  224.777481  21697600.0   
2006-01-06   GOOG  234.371521  225.773743  227.581970  231.960556  35646900.0   
2006-01-09   GOOG  235.816101  229.609375  232.334152  232.578247  25679600.0   
...           ...         ...         ...         ...         ...         ...   
2015-12-24   GOOG  751.349976  746.619995  749.549988  748.400024    527200.0   
2015-12-28   GOOG  762.989990  749.520020  752.919983  762.510010   1515300.0   
2015-12-29   GOOG  779.979980  766.429993  766.690002  776.599976   1765000.0   
2015-12-30   GOOG  777.599976  766.900024  776.599976  771.000000   1293300.0   
2015-12-31   GOOG  769.500000  758.340027  769.500000  758.880005   1500900.0   

             Adj Close  
Date                    
2006-01-03  216.802368  
2006-01-04  221.788681  
2006-01-05  224.777481  
2006-01-06  231.960556  
2006-01-09  232.578247  
...                ...  
2015-12-24  748.400024  
2015-12-28  762.510010  
2015-12-29  776.599976  
2015-12-30  771.000000  
2015-12-31  758.880005  

[2517 rows x 7 columns]
           Ticker         High          Low         Open        Close  \
Date                                                                    
2006-01-03  ^GSPC  1270.219971  1245.739990  1248.290039  1268.800049   
2006-01-04  ^GSPC  1275.369995  1267.739990  1268.800049  1273.459961   
2006-01-05  ^GSPC  1276.910034  1270.300049  1273.459961  1273.479980   
2006-01-06  ^GSPC  1286.089966  1273.479980  1273.479980  1285.449951   
2006-01-09  ^GSPC  1290.780029  1284.819946  1285.449951  1290.150024   
...           ...          ...          ...          ...          ...   
2015-12-24  ^GSPC  2067.360107  2058.729980  2063.520020  2060.989990   
2015-12-28  ^GSPC  2057.770020  2044.199951  2057.770020  2056.500000   
2015-12-29  ^GSPC  2081.560059  2060.540039  2060.540039  2078.360107   
2015-12-30  ^GSPC  2077.340088  2061.969971  2077.340088  2063.360107   
2015-12-31  ^GSPC  2062.540039  2043.619995  2060.590088  2043.939941   

                  Volume    Adj Close  
Date                                   
2006-01-03  2.554570e+09  1268.800049  
2006-01-04  2.515330e+09  1273.459961  
2006-01-05  2.433340e+09  1273.479980  
2006-01-06  2.446560e+09  1285.449951  
2006-01-09  2.301490e+09  1290.150024  
...                  ...          ...  
2015-12-24  1.411860e+09  2060.989990  
2015-12-28  2.492510e+09  2056.500000  
2015-12-29  2.542000e+09  2078.360107  
2015-12-30  2.367430e+09  2063.360107  
2015-12-31  2.655330e+09  2043.939941  

[2517 rows x 7 columns]
In [16]:
# Let's suppose we are interested in understanding how the stock prices of Apple,
# Microsoft, IBM, Google and the S&P 500 (^GSPC) fluctuated over the Great Recession, between 2008 and 2010.
# Pause the video and think about how we could visualize the bullish and bearish trends for each of the 5 tickers.

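# So here's my solution. We first subset the time period to between
# 2008 and 2010 and overwrite the old data. The rows are grouped by ticker, so the
# index is not sorted by date; a boolean mask on the year works regardless of order,
# and .copy() lets us safely modify the result afterwards
in_period = (stock_data.index.year >= 2008) & (stock_data.index.year <= 2010)
stock_data = stock_data[in_period].copy()
# and then reset the index
stock_data.reset_index(inplace = True)
# We can pass a mapper over the Date column
# to convert each date to matplotlib's numeric date format
stock_data['Date'] = stock_data['Date'].map(mdates.date2num)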
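To see what the date mapper does, here is a small sketch, assuming matplotlib is installed. `mdates.date2num` turns a datetime into a float counting days from matplotlib's epoch, and `mdates.num2date` reverses it. The absolute number depends on the epoch used by the installed matplotlib version, so only the difference and the round trip are inspected here. The variable names are illustrative.

```python
import datetime
import matplotlib.dates as mdates

day_one = datetime.datetime(2008, 1, 2)
day_two = datetime.datetime(2008, 1, 3)

num_one = mdates.date2num(day_one)
num_two = mdates.date2num(day_two)

# Consecutive calendar days are one unit apart on the numeric date axis
gap = num_two - num_one
print(gap)

# num2date converts back (it returns a timezone-aware datetime, UTC by default)
round_trip = mdates.num2date(num_one)
print(round_trip.date())
```

Because one unit on this axis is one day, a candle width of 0.5 spans half a day.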
In [17]:
# Cool! Next, let's import matplotlib.lines,
# which provides the Line2D artist we'll use as a legend handle,
import matplotlib.lines as mlines
# and we'll use the candlestick_ohlc function from mpl_finance to make candlestick plots
# (mpl_finance is no longer maintained; its successor mplfinance keeps this function
# under mplfinance.original_flavor)
from mpl_finance import candlestick_ohlc
In [59]:
# Let's set up the visualization by creating our fig object using matplotlib's
# plt.gcf() (get current figure) function as usual,

fig = plt.gcf()
# and I'm going to set its size in inches to 15 by 18
fig.set_size_inches(15, 18)

# Now let's iterate over the list of companies and create a stacked subplot for each of the 5 tickers.
for i in range(len(company_list)):
    # So stock_group holds the stock data for one company
    stock_group = stock_data.groupby('Ticker').get_group(company_list[i])
    # and we'll make a new subplot for each company
    ax = fig.add_subplot(5, 1, i+1)
    
    # Note that candlestick_ohlc expects a sequence of rows, each holding the date (as a
    # matplotlib date number) followed by the ohlc values. To recap, ohlc is just shorthand
    # for open, high, low and close, values that best represent time-series trends.
    # Let's pull out a list of the columns to be used, in that order.
    columns = ['Date', 'Open', 'High', 'Low', 'Close']
    
    # Cool! Now we'll use the candlestick_ohlc function (from the mpl_finance package),
    # pass in the ax object, and provide the date and ohlc price values.
    # It helps to adjust the width of the "candles", so I use a width of 0.5 (in days) and set the color
    # for bullish candles (colorup) to green; bearish candles (colordown) are red by default
    candlestick_ohlc(ax,
                      stock_group[columns].values,
                      width = 0.5,
                      colorup = 'g')
    ax.xaxis_date()
    # Let's create a legend for each plot using a blue star as a logo for each company
    blue_star = mlines.Line2D([], [], color = 'blue', marker = '*',
                               markersize = 15, label = company_list[i])
    # and use the blue star as the handle and place the legend at the lower right corner
    plt.legend(handles = [blue_star], loc='lower right')
    
    # and let the x and y axis labels be Date and Price respectively
    plt.xlabel("Date")
    plt.ylabel("Price")
    #plt.savefig('my_figure.png')
# and show the graph
plt.show()

Cool! The color and position of each candlestick show the direction the stock price moved on each trading day within the 3-year time frame. When the candlestick is green, the price moved upward; when the candlestick is red, the price moved downward.

You can clearly see that a majority of the candles are red during 2008. This reflects the overall pattern that prices for all 5 tickers plummeted from the beginning of 2008 to the end of the first quarter of 2009. Apple Inc. and IBM recovered their losses by the end of 2010, while the other three still traded lower at the end of 2010 than at the beginning of 2008. The size of the candles is also important, so I invite you to make your own interpretations.
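If you'd like to back up the visual impression with numbers, one option is to count the share of red (bearish) days per ticker and year. The sketch below uses a tiny made-up DataFrame standing in for the stock data; the ticker `AAA`, the prices, and the names `sample`, `Bearish` and `bearish_share` are all illustrative assumptions.

```python
import pandas as pd

# Tiny made-up sample standing in for stock_data (values are illustrative only)
sample = pd.DataFrame({
    "Date": pd.to_datetime(["2008-01-02", "2008-01-03", "2008-01-04",
                            "2009-01-02", "2009-01-05"]),
    "Ticker": ["AAA"] * 5,
    "Open":  [10.0, 9.8, 9.5, 8.0, 8.2],
    "Close": [9.8, 9.5, 9.6, 8.2, 8.5],
})

# A candle is red (bearish) when the close is below the open
sample["Bearish"] = sample["Close"] < sample["Open"]

# Share of bearish trading days per ticker and calendar year
bearish_share = sample.groupby(["Ticker", sample["Date"].dt.year])["Bearish"].mean()
print(bearish_share)
```

In the notebook itself, the Date column holds matplotlib date numbers at this point, so the year would need to be taken before that conversion or after converting back with mdates.num2date.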

In [64]:
# If you are interested in making your own comical infographics, you can create an
# xkcd-style plot by wrapping your plotting code in a with plt.xkcd() statement
with plt.xkcd():
    # So we'll just copy over all of our plotting code and run it through the xkcd style
    fig = plt.gcf()
    fig.set_size_inches(15, 18)

    for i in range(len(company_list)):
        stock_group = stock_data.groupby('Ticker').get_group(company_list[i])
        ax = fig.add_subplot(5, 1, i+1)
        columns = ['Date', 'Open', 'High', 'Low', 'Close']
        candlestick_ohlc(ax,
                          stock_group[columns].values,
                          width = 0.7,
                          colorup = 'g')
        ax.xaxis_date()
        blue_star = mlines.Line2D([], [], color = 'blue', marker = '*',
                                   markersize = 15, label = company_list[i])
        plt.legend(handles = [blue_star], loc='lower right')
        plt.xlabel("Date")
        plt.ylabel("Price")

In this lecture I introduced you to visualizing time-series data using candlestick plots, which have a number of important use cases for recognizing trade patterns. We went through some useful matplotlib tools such as mdates and mlines. You don't have to use all of these for your assignments. I've actually applied them quite a bit myself, out of curiosity about precipitation and solar levels in my hometown, Macau. It's worth taking the time to practice creating stylish graphs and annotations in your visualizations, since they're a powerful way to interact with data and with others as data scientists.