Xiao Xiao
Airbnb has revolutionized the short-term rental market, providing travelers with diverse accommodation options while allowing hosts to monetize their properties. In major cities like Toronto, understanding the pricing dynamics, occupancy trends, and key factors influencing Airbnb listings is crucial for both hosts and guests. This study leverages Airbnb's official dataset to explore key market insights and develop a rent forecasting model based on various property attributes and user reviews.
The main goal of this study is to analyze the structure and characteristics of the Airbnb market in Toronto in order to understand the key factors affecting rental pricing, customer satisfaction, host performance, and market trends.
Specifically, this analysis can help us answer the following questions:
Through this analysis, we can provide valuable insights for guests, hosts, and policy makers on the Airbnb platform, helping to optimize market strategies and improve the user experience.
Data Source: https://insideairbnb.com/get-the-data/
First, variables with no analytical significance, such as 'listing_url' and 'scrape_id', are deleted. Then variables whose number of null values exceeds half of the total number of rows are deleted, because that many missing values would lead to inaccurate analysis results.
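The two deletion rules above can be sketched in pandas as follows. This is a minimal illustration, not the study's actual code: the helper name `drop_uninformative_columns`, the `max_null_fraction` parameter, and the tiny sample frame (including the 'license' column) are assumptions made for the example.

```python
import pandas as pd

def drop_uninformative_columns(df, id_like_columns, max_null_fraction=0.5):
    """Drop identifier-style columns, then columns that are mostly null."""
    # Ignore names that are absent so the helper works on any snapshot.
    df = df.drop(columns=[c for c in id_like_columns if c in df.columns])
    # isna().mean() gives the fraction of missing values per column.
    null_fraction = df.isna().mean()
    # Keep a column unless its missing share exceeds the threshold.
    keep = null_fraction[null_fraction <= max_null_fraction].index
    return df[keep]

# Made-up rows standing in for the real listings file (assumption).
listings = pd.DataFrame({
    "listing_url": ["u1", "u2", "u3", "u4"],
    "scrape_id": [1, 1, 1, 1],
    "price": ["$100.00", "$80.00", None, "$120.00"],   # 25% missing, kept
    "license": [None, None, None, "X"],                # 75% missing, dropped
})
cleaned = drop_uninformative_columns(listings, ["listing_url", "scrape_id"])
print(list(cleaned.columns))
```

Keeping the threshold as a parameter makes it easy to check how sensitive the remaining column set is to the "more than half missing" rule.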
Numbers are extracted from the 'bathrooms_text' column and written into the 'bathrooms' column as the bathroom count. Then, for numerical variables, null values are filled with the median or the mean, depending on each variable's distribution, and unnecessary unit symbols such as '%' and '$' are removed to improve data integrity.
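A possible pandas version of this step is sketched below. The choice of which columns carry '$' and '%' ('price' and 'host_response_rate'), the choice of median versus mean for each column, and the helper name `clean_numeric_fields` are assumptions for illustration; the source does not specify them. Entries with no digit, such as "Half-bath", become missing here and are then filled like any other null.

```python
import pandas as pd

def clean_numeric_fields(df):
    """Extract bathroom counts, strip unit symbols, and fill nulls."""
    df = df.copy()
    # Pull the first number out of the text, e.g. "1.5 shared baths" -> 1.5.
    extracted = df["bathrooms_text"].str.extract(r"(\d+(?:\.\d+)?)", expand=False)
    df["bathrooms"] = pd.to_numeric(extracted, errors="coerce")
    # Remove '$' and thousands separators, then '%', before converting.
    df["price"] = pd.to_numeric(
        df["price"].str.replace(r"[$,]", "", regex=True), errors="coerce"
    )
    df["host_response_rate"] = pd.to_numeric(
        df["host_response_rate"].str.rstrip("%"), errors="coerce"
    )
    # Assumption: median for skewed columns, mean for the roughly symmetric one.
    df["price"] = df["price"].fillna(df["price"].median())
    df["bathrooms"] = df["bathrooms"].fillna(df["bathrooms"].median())
    df["host_response_rate"] = df["host_response_rate"].fillna(
        df["host_response_rate"].mean()
    )
    return df

# Made-up sample rows (assumption).
raw = pd.DataFrame({
    "bathrooms_text": ["1 bath", "1.5 shared baths", "Half-bath", None],
    "price": ["$1,200.00", "$80.00", None, "$100.00"],
    "host_response_rate": ["100%", "90%", None, "80%"],
})
result = clean_numeric_fields(raw)
print(result[["bathrooms", "price", "host_response_rate"]])
```

The median is less affected by a few very expensive listings, which is why it is used for 'price' in this sketch.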
Next, the data type of each variable is checked and converted where needed to make analysis easier. Finally, duplicate rows are deleted and outliers are removed to improve the accuracy of the analysis.
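The type conversion, de-duplication, and outlier removal might look like the sketch below. The source does not say how outliers were identified, so the 1.5 × IQR rule on 'price' is an assumption, as are the helper name `finalize_listings`, the columns chosen for conversion, and the sample data.

```python
import pandas as pd

def finalize_listings(df, price_col="price", k=1.5):
    """Convert types, drop duplicate rows, and filter price outliers."""
    df = df.copy()
    # Date strings become proper datetimes; unparseable values become NaT.
    df["last_review"] = pd.to_datetime(df["last_review"], errors="coerce")
    # 't' / 'f' flags become booleans.
    df["host_is_superhost"] = df["host_is_superhost"].map({"t": True, "f": False})
    # Remove rows that are identical in every column.
    df = df.drop_duplicates()
    # Assumed outlier rule: keep prices within k * IQR of the quartiles.
    q1, q3 = df[price_col].quantile([0.25, 0.75])
    iqr = q3 - q1
    in_range = df[price_col].between(q1 - k * iqr, q3 + k * iqr)
    return df[in_range].reset_index(drop=True)

# Made-up sample: rows 0 and 1 are duplicates, the last price is extreme.
sample = pd.DataFrame({
    "price": [100.0, 100.0, 90.0, 110.0, 120.0, 95.0, 5000.0],
    "last_review": ["2024-01-05", "2024-01-05", "2024-02-10", "2024-03-01",
                    "2023-12-20", "2024-01-15", "2024-02-02"],
    "host_is_superhost": ["t", "t", "f", "t", "f", "f", "t"],
})
final = finalize_listings(sample)
print(final)
```

Duplicates are dropped before the quartiles are computed so that repeated rows do not distort the outlier bounds.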
To analyze the factors influencing Airbnb rental prices in Toronto, we applied several data visualization and statistical analysis techniques, including histograms of variable distributions, a correlation heatmap, and interactive geographical maps.
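As a rough illustration of the numbers behind two of these views, the sketch below computes histogram bin counts for price and a Pearson correlation matrix without drawing anything. The sample values, the column selection, and the choice of three bins are assumptions for the example; in practice the resulting arrays would be handed to a plotting library.

```python
import numpy as np
import pandas as pd

# Made-up numeric columns standing in for the cleaned listings (assumption).
numeric = pd.DataFrame({
    "price": [80, 100, 120, 150, 200, 90],
    "accommodates": [1, 2, 2, 4, 6, 2],
    "bedrooms": [1, 1, 1, 2, 3, 1],
})

# Histogram data: counts per equal-width price bin and the bin edges.
counts, edges = np.histogram(numeric["price"], bins=3)

# Correlation heatmap data: pairwise Pearson correlations between columns.
corr = numeric.corr()

print(counts, edges)
print(corr.round(2))
```

Inspecting the correlation matrix directly is a quick way to see which attributes move together with price before committing to a particular chart.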
