The Unit for Data Science is a hub for research collaborations and student mentorship. It is a one-of-a-kind resource at ASU Library that connects students, faculty, and staff from disciplines across the university. By working with the unit, students and faculty have grown their knowledge of data science and increased its impact on their work.

We encourage you to explore some of the datasets the Unit for Data Science and Analytics is currently working with. The datasets in this collection represent the collaborative projects and training opportunities the unit pursues in data science research, which engages machine learning, data analytics, visual storytelling, network analysis, and text and data mining.

Feel free to use these datasets as part of your research, coursework, or just for fun! If you have any questions, use the contact link, and the data science team will happily respond.

Mar 19, 2026
Abbasov, Namig; Mehta, Hetavi Dilip; Dwivedi, Vishnu; Abhiramacheri, Vaibhav; Panda, Abhipsa; Rathod, Mohak Narendrakumar; Chandrasekhar, Tejas; Ndegwa, Martin Mwangi; Zhang, Fan; Mysore Srinidhi, Chinmayi; Patel, Puravkumar; Tang, Wei Chieh; Bhatia, Sargun; Vongsenekeo, Tylor; Batra, Garima; Huang, Yaqing, 2026, "A Corpus of Artificial Intelligence Policies from US R1 Research Universities", https://doi.org/10.48349/ASU/VOYGW9, ASU Library Research Data Repository, V1, UNF:6:f3nmKdlJQZ3f7roPbZUNCw== [fileUNF]
This corpus contains official artificial intelligence (AI) policy documents issued by R1 universities across the United States. It includes guidelines on AI use in teaching and learning, academic integrity, research practices, data privacy, governance, and ethical considerations. The collection reflects how institutions are responding to generative...
ZIP Archive - 477.7 KB - MD5: a1f884daebd9d78837148cf8ba1a46f2
Adobe PDF - 146.1 KB - MD5: f26a3cfc18b6e506940ae0f00cc76ba4
Tabular Data - 19.3 KB - 9 Variables, 133 Observations - UNF:6:f3nmKdlJQZ3f7roPbZUNCw==
Plain Text - 4.3 KB - MD5: 8fb08d24222d5e94471761717771c487
Jun 20, 2022
Little, David, 2022, "Microplastics Images dataset", https://doi.org/10.48349/ASU/ZCEM6W, ASU Library Research Data Repository, V1
This dataset is a collection of images of microplastics. Microplastics are small fragments of plastic (<5 mm) that potentially have a negative impact on our health and the environment. Suggested uses include image classification, image segmentation, and other image processing tasks. The ZIP file contains color images. Size: 34.1 MB. Type: JPEG.
ZIP Archive - 31.5 MB - MD5: 498355f06fbdee1cd5a9186df08b1e92
Jun 20, 2022
Little, David, 2022, "WallStreetBets Subreddit dataset", https://doi.org/10.48349/ASU/WLV8JA, ASU Library Research Data Repository, V1, UNF:6:9xDozy1E0i7ZAlZqh9EXEQ== [fileUNF]
This dataset is an extract of the subreddit r/wallstreetbets from the website Reddit.com. It contains all of the non-deleted posts from January and February 2021. The dataset is well suited to many kinds of natural language processing (NLP) tasks.
Tabular Data - 663.7 KB - 11 Variables, 774 Observations - UNF:6:9xDozy1E0i7ZAlZqh9EXEQ==
Jun 20, 2022
Little, David, 2022, "Homeless Management Information System (HMIS) Usage", https://doi.org/10.48349/ASU/NJOZNU, ASU Library Research Data Repository, V1, UNF:6:lVgMtOHGB4/ktKdlxSzdyA== [fileUNF]
This dataset is derived from the Homeless Management Information System (HMIS), which is a government-run database to collect client-level data on housing and services to homeless individuals and families. This particular dataset counts the number of times each homeless individual (rows) attends each of the different services/projects (columns) ava...