The Unit for Data Science is a hub for research collaborations and student mentorship. It is a one-of-a-kind resource at ASU Library that connects students, faculty, and staff from disciplines across the university. By working with the unit, students and faculty have grown their knowledge of data science and increased its impact on their work.

We encourage you to play with some of the datasets the Unit for Data Science and Analytics is currently working with. The datasets in this collection represent the unit's collaborative projects and training opportunities in data science research, which engages machine learning, data analytics, visual storytelling, network analysis, and text and data mining.

Feel free to use these datasets as part of your research, coursework, or just for fun! If you have any questions, use the contact link, and the data science team will happily respond.

1 to 3 of 3 Results
Jun 20, 2022
Little, David, 2022, "Microplastics Images dataset", https://doi.org/10.48349/ASU/ZCEM6W, ASU Library Research Data Repository, V1
This dataset is a collection of images of microplastics. Microplastics are small fragments of plastic (<5mm) that potentially have a negative impact on our health and the environment. Suggested dataset uses are for image classification, image segmentation, or any other image proc...
Jun 20, 2022
Little, David, 2022, "WallStreetBets Subreddit dataset", https://doi.org/10.48349/ASU/WLV8JA, ASU Library Research Data Repository, V1, UNF:6:9xDozy1E0i7ZAlZqh9EXEQ== [fileUNF]
This dataset is an extract of the subreddit /r/wallstreetbets from the website Reddit.com. It contains all of the non-deleted posts from January and February 2021. This dataset is well suited to all types of natural language processing (NLP).
Jun 20, 2022
Little, David, 2022, "Homeless Management Information System (HMIS) Usage", https://doi.org/10.48349/ASU/NJOZNU, ASU Library Research Data Repository, V1, UNF:6:lVgMtOHGB4/ktKdlxSzdyA== [fileUNF]
This dataset is derived from the Homeless Management Information System (HMIS), a government-run database that collects client-level data on housing and services provided to homeless individuals and families. This particular dataset counts the number of times each homeless individua...