Data Science: Project Discussion

Hi all,

Our next session will be held on Saturday 15th July at 15:00 at Starbucks, El Corte Inglés, Av. Federico Soto, 1, 03003 Alicante.

This will be a 4-hour session that recaps the principles of OLS (ordinary least squares) covered in the last session and introduces other forms of regression analysis, including time series models.

The session will also lay the foundations for a data science project that we would undertake collectively. The consensus so far is that stock price analysis is the favoured topic. So, here is what I envisage (and feel free to contribute your own thoughts as well).

Project: Develop a time series model to predict future movements of Apple (AAPL) stock.

Methodology and Steps:

1. Manually design a stock price database using a database system such as MySQL or PostgreSQL. While in the real world this sort of data would usually be streamed from an independent database, there are many times when a database needs to be prepared from scratch. (While I’ve done some database creation before, my experience is more in statistics and machine learning, so anyone with specialised database knowledge would be very useful for this part of the project.)

2. Using R and Python, develop a time series model (such as ARIMA) that can be used to forecast future values of AAPL. Other models are also open for discussion, and we may well come up with better ideas once we talk it through.
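As a rough starting point for step 1, here is a minimal sketch of what a daily price table could look like. It uses SQLite through Python's built-in sqlite3 module purely because it needs no server, so treat it as a stand-in for MySQL or PostgreSQL rather than a decision. The table name, the column names and the sample rows are all assumptions made up for illustration, and the prices are not real quotes.

```python
import sqlite3

# Hypothetical schema: one row per ticker per trading day.
# Dates are stored as ISO 8601 text (YYYY-MM-DD) so they sort correctly.
SCHEMA = """
CREATE TABLE IF NOT EXISTS daily_prices (
    ticker      TEXT    NOT NULL,
    trade_date  TEXT    NOT NULL,
    open_price  REAL    NOT NULL,
    high_price  REAL    NOT NULL,
    low_price   REAL    NOT NULL,
    close_price REAL    NOT NULL,
    volume      INTEGER NOT NULL,
    PRIMARY KEY (ticker, trade_date)
)
"""

# Made-up rows for illustration only, not real market data.
SAMPLE_ROWS = [
    ("AAPL", "2017-01-04", 101.0, 102.0, 100.5, 101.8, 1200),
    ("AAPL", "2017-01-03", 100.0, 101.5, 99.5, 101.0, 1000),
    ("AAPL", "2017-01-05", 101.8, 103.0, 101.2, 102.6, 900),
]


def create_database(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def insert_prices(conn, rows):
    """Insert rows; re-inserting the same ticker and date overwrites the old row."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO daily_prices VALUES (?, ?, ?, ?, ?, ?, ?)",
            rows,
        )


def closing_prices(conn, ticker):
    """Return (trade_date, close_price) pairs for one ticker, oldest first."""
    cursor = conn.execute(
        "SELECT trade_date, close_price FROM daily_prices "
        "WHERE ticker = ? ORDER BY trade_date",
        (ticker,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    connection = create_database()
    insert_prices(connection, SAMPLE_ROWS)
    for trade_date, close in closing_prices(connection, "AAPL"):
        print(trade_date, close)
```

The composite primary key on ticker and date keeps duplicate days out of the table, which matters once the same file gets loaded twice. The date-ordered closing prices are the series the modelling work in step 2 would start from.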

How we could structure this:

A good way to undertake this project could be to form three separate teams.

Team 1: Develop the stock price database using MySQL, PostgreSQL or similar.

Team 2: Implement modelling techniques to forecast AAPL stock using R.

Team 3: Implement modelling techniques to forecast AAPL stock using Python.
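To give the modelling teams something concrete to argue about, here is a small Python sketch of the simplest model in the ARIMA family: an AR(1) model on the day-to-day price changes, which corresponds to ARIMA(1,1,0) with a constant. It is fitted with plain OLS rather than the maximum likelihood fit a statistics library would normally use, so it is a simplified stand-in and not a full ARIMA implementation. The function names and the example prices are made up for illustration.

```python
def difference(series):
    """First differences: the change from each observation to the next."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


def fit_ar1(values):
    """OLS fit of values[t] = c + phi * values[t-1] + error.

    Returns the intercept c and the autoregressive coefficient phi.
    """
    lagged = values[:-1]
    current = values[1:]
    n = len(lagged)
    if n < 2:
        raise ValueError("need at least three values to fit an AR(1) model")
    mean_x = sum(lagged) / n
    mean_y = sum(current) / n
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    if sxx == 0:
        raise ValueError("lagged values are constant, so phi cannot be estimated")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lagged, current))
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    return c, phi


def forecast(prices, steps):
    """Forecast future prices from an AR(1) model on the price changes."""
    changes = difference(prices)
    c, phi = fit_ar1(changes)
    last_price = prices[-1]
    last_change = changes[-1]
    predictions = []
    for _ in range(steps):
        # Predict the next change, then add it back onto the price level.
        last_change = c + phi * last_change
        last_price = last_price + last_change
        predictions.append(last_price)
    return predictions


# Made-up closing prices for illustration only, not real AAPL data.
EXAMPLE_PRICES = [100.0, 101.2, 100.7, 102.3, 103.1, 102.8, 104.0, 104.9]

if __name__ == "__main__":
    print(forecast(EXAMPLE_PRICES, steps=3))
```

Working on the differences rather than the raw prices is the "I" in ARIMA: price levels usually trend, while the changes are closer to stationary. Comparing a baseline like this against the R team's results would be a simple first check before trying anything more elaborate.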

Again, a lot of this depends on the number of people interested and on the skill sets of each member. But this is an initial structure, and we can discuss it further once we meet.

Also, the teams won’t necessarily work in isolation – we will still compare results and learn from each other.

If you have any questions or suggestions, please don’t hesitate to reach out. I look forward to seeing you on the 15th!

Also, here are our GitHub and Slack pages:

GitHub: https://github.com/AlicanteDataScience

Slack: https://join.slack.com/alicante-data-science/shared_invite/MjA3NzUyNTk2ODgzLTE0OTkyMDc1MjYtYjdhYjQ5MGQwMw

Author: Jeroen Derks

Jeroen is the founder of the Alicante Tech meetup group. His day job mostly involves building all kinds of applications, ranging from IoT to educational to corporate.