Quandl is the new, better, Yahoo finance

Getting financial data these days is not as hard as it perhaps used to be. If you like to play around with it and see if you can do anything useful with it, you may use what I`m gonna give you here. I`m gonna show you simple way of getting S&P 500 data by scraping wikipedia for the list of symbols, and using Quandl API to fetch the data.

A bit of history

This used to be very easy, we had Yahoo finance and scripts to download the test data to your db were on every corner. Those days are gone, since Yahoo link got broken and it doesn`t seem that it is coming back any time soon. Quandl, however, is great replacement, that is getting better every month it seems. They support Python, which is great, as they make our lives easy with that. But not too easy, of course, there are some subtleties of how to get that data. To overcome those (it is not that many of them) I wrote this little script. Hope you find it helpful:

                    #!/usr/bin/env python3
                    import time
                    from sqlalchemy import create_engine
                    import quandl
                    import bs4
                    import requests
                    """Imports the list of symbols from wikipedia, and appends 'WIKI/', to
                    make them readable by Quandl API."""
                    response = requests.get(
                    soup = bs4.BeautifulSoup(response.text)
                    symbolslist = soup.select('table')[0].select('tr')[1:]
                    symbols = []
                    for i, symbol in enumerate(symbolslist):
                        tds = symbol.select('td')
                        symbols.append('WIKI/' + tds[0].select('a')[0].text)
                    """Gets data for each symbol using quandl and downloads them
                    into postgresql database. Important to note here, is that you
                    need quandl api key (which is free). 
                    Before we send the request, we need to replace . by _ and remove last
                    underscore, to make it readable for quandl API. 
                    Sometimes Quandl API crashes or sends the error response, so for it not
                    to stop our whole script, we sometimes need to repeat the requests untill 
                    we get the data. Be aware, this is not the best solution, it works for me,
                    but you may wanna look into it and change up the code in the while loop."""
                    engine = create_engine(
                    quandl.ApiConfig.api_key = "Your Quandl key here!"
                    for s in symbols:
                        s = s.replace('.', '_')
                        if s[-1] == '_':
                            s = s[:-1]
                        while True:
                                daily_data = quandl.get(
                                    s, collapse="daily", start_date="2000-1-1"
                                print(str(s) + ' attempting again...')
                        daily_data.to_sql(name=s, con=engine, if_exists='replace', index=True)
                        print(s + ' done...')