Twitter Geospatial Data
Donated on 9/11/2024
Seven days of geo-tagged Tweet data from the United States with exact GPS location and timestamp.
Dataset Characteristics
Multivariate, Time-Series, Spatiotemporal
Subject Area
Social Science
Associated Tasks
Classification, Regression, Clustering
Feature Type
Real, Categorical, Integer
# Instances
14262517
# Features
4
Dataset Information
Additional Information
Note that this is the full week of data that was sampled from Twitter. The 10,005,301 count mentioned in the introductory paper below refers to the weekday portion of the data (i.e., Monday through Friday). If you remove Saturday (Jan 12, 2013) and Sunday (Jan 13, 2013), then you will get the Monday through Friday portion that was analyzed in the paper.
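The weekday subset described above can be recovered with a simple filter. The following is a minimal sketch, assuming the data have been loaded into a pandas DataFrame whose `timestamp` column holds integers in the `YYYYMMDDHHMMSS` form listed under the variable descriptions. The helper name `weekday_subset` is hypothetical and the sample rows are made up for illustration.

```python
import pandas as pd


def weekday_subset(df: pd.DataFrame) -> pd.DataFrame:
    """Drop rows dated Saturday 2013-01-12 or Sunday 2013-01-13 (CST)."""
    # Integer division strips the HHMMSS part, leaving YYYYMMDD.
    day = df["timestamp"] // 1_000_000
    return df[~day.isin([20130112, 20130113])]


# Made-up rows for illustration only (not taken from the dataset).
sample = pd.DataFrame({
    "longitude": [-88.2, -74.0, -118.2],
    "latitude": [40.1, 40.7, 34.1],
    "timestamp": [20130112093000, 20130114120500, 20130118235959],
    "timezone": [2, 1, 4],
})

weekdays = weekday_subset(sample)
print(len(weekdays))
```

Applied to the full dataset, the number of remaining rows should match the 10,005,301 count cited above if the file follows the layout assumed here.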
Has Missing Values?
No
Introductory Paper
By Nathaniel E. Helwig, Yizhao Gao, Shaowen Wang, Ping Ma. 2015
Published in Spatial Statistics
Variable Information
This dataset contains geospatial and timestamp data for one week's worth of Tweets in the contiguous United States. The Tweets were created between January 12, 2013 and January 18, 2013. The exact location (i.e., longitude and latitude) and timestamp (hour, minute, second) of each Tweet were recorded. All timestamps are reported in Central Standard Time (CST) in the format "YYYY-MM-DD HH:MM:SS". The geo-tag information was used to assign each Tweet to one of the four standard time zones (for details see Helwig et al., 2015). The data were collected by the CyberGIS Center for Advanced Digital and Spatial Studies at the University of Illinois at Urbana-Champaign. Details on the data preprocessing and analysis can be found in Helwig et al. (2015).
Class Labels
1. longitude: exact longitude coordinate of Tweet (real valued)
2. latitude: exact latitude coordinate of Tweet (real valued)
3. timestamp: 20130112000000 = 2013-01-12 00:00:00 CST (integer)
4. timezone: 1 = Eastern, 2 = Central, 3 = Mountain, 4 = Pacific (integer)
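The integer encodings above can be decoded into more convenient types. The sketch below assumes a pandas DataFrame with `timestamp` and `timezone` columns as described. The `decode` helper, the added column names, and the hour offsets are illustrative assumptions. The offsets rely on all four zones observing standard time in January, so each zone differs from CST by a whole number of hours.

```python
import pandas as pd

# Codes as listed in the variable description.
TIMEZONE_NAMES = {1: "Eastern", 2: "Central", 3: "Mountain", 4: "Pacific"}

# Assumption: hours to add to a CST time to obtain local standard time
# (January, so no daylight saving time is in effect).
CST_TO_LOCAL_HOURS = {1: 1, 2: 0, 3: -1, 4: -2}


def decode(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with parsed CST times, zone names, and local times."""
    out = df.copy()
    # 20130112000000 -> 2013-01-12 00:00:00 (naive datetime, understood as CST)
    out["time_cst"] = pd.to_datetime(
        out["timestamp"].astype(str), format="%Y%m%d%H%M%S"
    )
    out["zone_name"] = out["timezone"].map(TIMEZONE_NAMES)
    offsets = pd.to_timedelta(out["timezone"].map(CST_TO_LOCAL_HOURS), unit="h")
    out["time_local"] = out["time_cst"] + offsets
    return out


# Made-up rows for illustration only (not taken from the dataset).
sample = pd.DataFrame({
    "timestamp": [20130112000000, 20130114120500],
    "timezone": [4, 1],
})
decoded = decode(sample)
print(decoded[["time_cst", "zone_name", "time_local"]])
```

Keeping the CST column alongside the local one makes it possible to compare Tweets either at the same instant or at the same local time of day.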
Dataset Files
File | Size |
---|---|
twitter.zip | 179.5 MB |
pip install ucimlrepo
from ucimlrepo import fetch_ucirepo

# fetch dataset
twitter_geospatial_data = fetch_ucirepo(id=1050)

# data (as pandas dataframes)
X = twitter_geospatial_data.data.features
y = twitter_geospatial_data.data.targets

# metadata
print(twitter_geospatial_data.metadata)

# variable information
print(twitter_geospatial_data.variables)
Helwig, N., Gao, Y., Wang, S., & Ma, P. (2015). Twitter Geospatial Data [Dataset]. UCI Machine Learning Repository. https://doi.org/10.24432/C5RS5J.
Creators
Nathaniel E. Helwig
Yizhao Gao
Shaowen Wang
Ping Ma
DOI
10.24432/C5RS5J
License
This dataset is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
This allows for the sharing and adaptation of the datasets for any purpose, provided that the appropriate credit is given.