Twitter Geospatial Data

Donated on 9/11/2024

Seven days of geo-tagged Tweet data from the United States with exact GPS location and timestamp.

Dataset Characteristics

Multivariate, Time-Series, Spatiotemporal

Subject Area

Social Science

Associated Tasks

Classification, Regression, Clustering

Feature Type

Real, Categorical, Integer

Number of Instances

14,262,517

Number of Features

4

Dataset Information

Additional Information

Note that this is the full week of data that was sampled from Twitter. The 10,005,301 count mentioned in the introductory paper below refers to the weekday portion of the data (i.e., Monday through Friday). If you remove Saturday (Jan 12, 2013) and Sunday (Jan 13, 2013), then you will get the Monday through Friday portion that was analyzed in the paper.
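The weekday subset described above can be recovered with a simple date filter. The following is a minimal sketch, assuming each timestamp is available as an integer in the YYYYMMDDHHMMSS encoding given in the variable list; the function names `is_weekday_record` and `weekday_portion` are hypothetical and not part of the dataset or the paper.

```python
from typing import Iterable, List

# Saturday and Sunday of the sampled week, as YYYYMMDD integers.
WEEKEND_DATES = (20130112, 20130113)


def is_weekday_record(timestamp: int) -> bool:
    """Return True unless the timestamp falls on Jan 12 or Jan 13, 2013.

    Assumes the integer encoding YYYYMMDDHHMMSS, so integer division by
    1,000,000 drops the HHMMSS part and leaves YYYYMMDD.
    """
    date_part = timestamp // 1_000_000
    return date_part not in WEEKEND_DATES


def weekday_portion(timestamps: Iterable[int]) -> List[int]:
    """Keep only the Monday-through-Friday timestamps."""
    return [t for t in timestamps if is_weekday_record(t)]


if __name__ == "__main__":
    # Made-up example values, one per day type.
    sample = [20130112000000, 20130113235959, 20130114083000]
    print(weekday_portion(sample))
```

Filtering on the date prefix avoids parsing every value into a date object, which matters at roughly 14 million rows.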

Has Missing Values?

No

Introductory Paper

Analyzing spatiotemporal trends in social media data via smoothing spline analysis of variance

By Nathaniel E. Helwig, Yizhao Gao, Shaowen Wang, Ping Ma. 2015

Published in Spatial Statistics

Variable Information

This dataset contains geospatial and timestamp data for one week's worth of Tweets in the contiguous United States. The Tweets were created between January 12, 2013 and January 18, 2013. The exact location (i.e., longitude and latitude) and timestamp (hour, minute, second) of each Tweet were recorded. All timestamps are reported in Central Standard Time in the format "YYYY-MM-DD HH:MM:SS". The geo-tag information was used to assign each Tweet to one of the four standard time zones (for details see Helwig et al., 2015). The data were collected by the CyberGIS Center for Advanced Digital and Spatial Studies at the University of Illinois at Urbana-Champaign. Details on the data preprocessing and analysis can be found in Helwig et al. (2015).

Class Labels

1. longitude: exact longitude coordinate of Tweet (real valued)
2. latitude: exact latitude coordinate of Tweet (real valued)
3. timestamp: 20130112000000 = 2013-01-12 00:00:00 CST (integer)
4. timezone: 1 = Eastern, 2 = Central, 3 = Mountain, 4 = Pacific (integer)
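The integer encodings listed above can be decoded with the Python standard library. This is a minimal sketch, assuming the timestamp integer is read as YYYYMMDDHHMMSS and that CST means a fixed UTC-6 offset; the helper names `decode_timestamp` and `decode_record` and the example coordinates are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: CST is treated as a fixed UTC-6 offset (January, no daylight saving).
CST = timezone(timedelta(hours=-6), "CST")

# Code-to-name mapping taken from the variable list.
TIMEZONE_NAMES = {1: "Eastern", 2: "Central", 3: "Mountain", 4: "Pacific"}


def decode_timestamp(value: int) -> datetime:
    """Turn an integer such as 20130112000000 into a CST-aware datetime."""
    return datetime.strptime(str(value), "%Y%m%d%H%M%S").replace(tzinfo=CST)


def decode_record(longitude: float, latitude: float,
                  timestamp: int, tz_code: int) -> dict:
    """Return one record with a parsed timestamp and a named time zone."""
    return {
        "longitude": longitude,
        "latitude": latitude,
        "timestamp": decode_timestamp(timestamp),
        "timezone": TIMEZONE_NAMES[tz_code],
    }


if __name__ == "__main__":
    # Made-up coordinates, used only to show the call.
    record = decode_record(-88.2, 40.1, 20130114083000, 2)
    print(record["timestamp"].isoformat(), record["timezone"])
```

An unknown time zone code raises `KeyError` here, which seems preferable to silently mislabeling rows.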

Dataset Files

File           Size
twitter.zip    179.5 MB


Creators

Nathaniel E. Helwig

Yizhao Gao

Shaowen Wang

Ping Ma
