Yan W - Data Analysis & Web Development Portfolio
Stock Analysis Project — Visit Site
This project built an end-to-end ETL pipeline that extracts, transforms, and loads stock data for visualization on the web.
The pipeline uses a Python stack: Scrapy crawls the day's Top 100 stocks by volume from TMX Money, Pandas cleans and transforms the data, MySQL stores it in a structured form, Django handles back-end management, and Plotly renders the visualizations.
ETL Workflow: The extraction, transformation, and loading steps are combined into a single Python script, which shortens the update cycle without compromising clarity or efficiency.
Visualization and Web Integration: Django handles data processing on the back end, and Plotly renders dynamic charts of stock trends and trading volume.
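The single-script ETL workflow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: it uses only the standard library, with plain dictionaries standing in for Pandas and `sqlite3` standing in for MySQL. The sample rows, the `top_volume` table, and the function names are all hypothetical.

```python
import sqlite3

# Hypothetical rows as a crawler might yield them (invented values, raw strings).
RAW_ROWS = [
    {"symbol": " abc ", "price": "12.50", "volume": "1,200,000"},
    {"symbol": "XYZ", "price": "n/a", "volume": "950,000"},  # unparseable price
    {"symbol": "DEF", "price": "3.10", "volume": "2,400,500"},
]


def transform(rows):
    """Normalize symbols, parse numbers, and drop rows that fail to parse."""
    clean = []
    for row in rows:
        try:
            clean.append((
                row["symbol"].strip().upper(),
                float(row["price"]),
                int(row["volume"].replace(",", "")),
            ))
        except ValueError:
            continue  # skip rows with non-numeric price or volume
    return clean


def load(conn, records):
    """Create the table if needed and upsert one row per symbol."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS top_volume ("
        "symbol TEXT PRIMARY KEY, price REAL, volume INTEGER)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO top_volume VALUES (?, ?, ?)", records
    )
    conn.commit()


def run_etl(conn, rows):
    """Run transform and load in one call; return the number of rows stored."""
    records = transform(rows)
    load(conn, records)
    return len(records)


conn = sqlite3.connect(":memory:")  # in-memory stand-in for the MySQL database
loaded = run_etl(conn, RAW_ROWS)
```

Keeping the steps as separate functions inside one script preserves the single entry point while still letting each stage be tested on its own.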
The current ETL process must be run manually every day, so future development will explore Celery distributed tasks to automate the process and push periodic database updates.
In addition, the current focus is limited to the Top 100 most traded stocks, which may not satisfy users interested in a wider range of market data. The plan is therefore to incorporate additional data sources that cover multiple markets and a wider range of stock indicators.
Financial Risk Project — Visit Site
This project used machine learning to predict loan approval outcomes and assess financial risk, integrating data processing, predictive modeling, and dynamic visualization.
Specifically, the project used Pandas to clean and transform data, and TensorFlow with cross-validation to develop and evaluate neural network models. Plotly was used to create interactive visualizations that show where a user's data falls within the distribution of historical data.
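As an illustration of the cross-validation step, the sketch below generates k-fold train/validation index splits in plain Python. The k-fold variant is an assumption, since the type of cross-validation is not stated here, and `k_fold_indices` is a hypothetical helper; in practice each split would be used to fit and score one copy of the TensorFlow model.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) pairs; each sample is validated exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded shuffle for reproducibility
    folds = [indices[i::k] for i in range(k)]  # k roughly equal folds
    for i in range(k):
        val_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, val_idx


splits = list(k_fold_indices(n_samples=10, k=5))
```

Averaging the validation score across the folds gives a less optimistic estimate than a single train/test split, which matters when the dataset is small.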
Loan Prediction Modeling: Key financial parameters such as credit score, income, loan amount, and debt are used to predict loan approval status and the associated probability.
Interactive Visualization: Plotly charts show where user inputs fall within the distribution of historical financial data, giving a clear overview of financial risk.
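The idea of placing a user input within the historical distribution can be expressed as a percentile rank, sketched below with the standard library. The credit scores are invented sample values and `percentile_rank` is a hypothetical helper; the project's charts may compute or present this differently.

```python
from bisect import bisect_right


def percentile_rank(history, value):
    """Return the percentage of historical values less than or equal to value."""
    ordered = sorted(history)
    return 100.0 * bisect_right(ordered, value) / len(ordered)


# Invented historical credit scores, for illustration only.
historical_scores = [580, 610, 640, 660, 690, 700, 720, 750, 780, 810]
user_rank = percentile_rank(historical_scores, 700)
```

A value like this could be drawn as a marker on a histogram of the historical data, so the user sees their position relative to past applicants.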
Future development will incorporate additional financial parameters and more advanced feature engineering techniques to improve prediction accuracy.
It will also automate data updates and enrich the interactive visualization with more detail.