Getting financial data these days is not as hard as it perhaps used to be. If you like to play around with it and see if you can do anything useful with it, you may use what I`m gonna give you here. I`m gonna show you simple way of getting S&P 500 data by scraping wikipedia for the list of symbols, and using Quandl API to fetch the data.
This used to be very easy, we had Yahoo finance and scripts to download the test data to your db were on every corner. Those days are gone, since Yahoo link got broken and it doesn`t seem that it is coming back any time soon. Quandl, however, is great replacement, that is getting better every month it seems. They support Python, which is great, as they make our lives easy with that. But not too easy, of course, there are some subtleties of how to get that data. To overcome those (it is not that many of them) I wrote this little script. Hope you find it helpful:
from sqlalchemy import create_engine
"""Imports the list of symbols from wikipedia, and appends 'WIKI/', to
make them readable by Quandl API."""
response = requests.get(
soup = bs4.BeautifulSoup(response.text)
symbolslist = soup.select('table').select('tr')[1:]
symbols = 
for i, symbol in enumerate(symbolslist):
tds = symbol.select('td')
symbols.append('WIKI/' + tds.select('a').text)
"""Gets data for each symbol using quandl and downloads them
into postgresql database. Important to note here, is that you
need quandl api key (which is free).
Before we send the request, we need to replace . by _ and remove last
underscore, to make it readable for quandl API.
Sometimes Quandl API crashes or sends the error response, so for it not
to stop our whole script, we sometimes need to repeat the requests untill
we get the data. Be aware, this is not the best solution, it works for me,
but you may wanna look into it and change up the code in the while loop."""
engine = create_engine(
quandl.ApiConfig.api_key = "Your Quandl key here!"
for s in symbols:
s = s.replace('.', '_')
if s[-1] == '_':
s = s[:-1]
daily_data = quandl.get(
s, collapse="daily", start_date="2000-1-1"
print(str(s) + ' attempting again...')
daily_data.to_sql(name=s, con=engine, if_exists='replace', index=True)
print(s + ' done...')