The following sites might prove useful when searching for datasets related to your research:
Science.gov is an official website of the U.S. government that provides access to federally funded scientific research results without requiring users to know which agency funded the research. Research results include scientific and technical reports, peer-reviewed scholarly publications, digital data, software, conference presentations and proceedings, and other scientific and technical information that federal agencies publish as a result of their research investments.
Data.gov, the United States Government's open data site, is designed to unleash the power of government open data to inform decisions by the public and policymakers, drive innovation and economic activity, achieve agency missions, and strengthen the foundation of an open and transparent government.
NASA's Earth Science Data Systems (ESDS) Program oversees the life cycle of NASA's Earth science data, from acquisition through processing and distribution. Its primary goal is to maximize the scientific return from NASA's missions and experiments.
re3data.org is a comprehensive registry of research data repositories that is global and covers all research disciplines.
You can use re3data to find repositories where you can discover and deposit research datasets, such as experimental and simulation data; images, sound, and video; surveys and observations; physical samples; laboratory and clinical trial data; software source code; genomic and geospatial data; and more.
The UCI Machine Learning Repository is a collection of databases, domain theories, and data generators that are used by the machine learning community for the empirical analysis of machine learning algorithms.
The Harvard Dataverse Repository is a free data repository open to all researchers from any discipline, both inside and outside of the Harvard community, where you can share, archive, cite, access, and explore research data. Each individual Dataverse collection is a customizable collection of datasets (or a virtual repository) for organizing, managing, and showcasing datasets.
DataONE is a community-driven program providing access to data across multiple member repositories, supporting enhanced search and discovery of Earth and environmental data. DataONE promotes best practices in data management through responsive educational resources and materials. It envisions researchers, educators, and the public using DataONE to better understand and conserve life on Earth and the environment that sustains it.
RCSB PDB is the US data center for the global Protein Data Bank (PDB) archive of 3D structure data for large biological molecules (proteins, DNA, and RNA) essential for research and education in fundamental biology, health, energy, and biotechnology.
Access data related to the population of the United States.
The CERN Open Data portal is the access point to a growing range of data produced through the research performed at CERN. It disseminates the preserved output from various research activities and includes the accompanying software and documentation needed to understand and analyze the data. The portal adheres to established global standards in data preservation and Open Science: the products are shared under open licenses; they are issued with a Digital Object Identifier (DOI) to make them citable objects.
This is a list of repositories and databases for open data.