This page contains a collection of publicly available datasets, models, and software that might be useful for researchers and professionals who study energy systems and informatics. We continue to add more resources to this page. If you own or know about energy-related resources that are not listed here, please contact the Information Services Director.

Disclaimer: While the information contained in this page is presented with all due care, we do not warrant or represent that it is free from errors or omissions.

Models, Libraries, and Simulation Software

  • OpenDSS is an electric power distribution system simulator
  • Antares-Simulator is an open-source simulator of large interconnected power grids
  • PyPSA is a Python toolbox for simulating and optimizing modern power systems
  • EnergyPlus is a whole building energy simulation program
  • SAM is suitable for detailed performance and financial analysis of renewable energy and storage systems
  • PVWatts is a calculator for estimating the energy production and cost of grid-connected photovoltaic energy systems throughout the world
  • pvlib python is a library for simulating the performance of photovoltaic systems
  • Bird is a clear sky solar irradiance model developed by NREL
  • Pysolar is a collection of Python libraries for simulating the irradiation of any point on earth
  • Solar-TK is a toolkit for solar performance modelling and forecasting
  • WindML is a Python toolkit which provides access to wind data sources for various machine learning tasks
  • Google Sunroof is an online tool for estimating solar savings potential of a given area in the US
  • Renewables.ninja is an online tool for simulating hourly power output from wind and solar PV farms located anywhere in the world
  • GridPV toolbox allows for modelling and simulating the integration of distributed generation into the electric power system
  • Markov models for simulating residential electricity consumption. The models are built using power consumption data collected every 6 seconds from 25 households in Canada
  • LoadProfileGenerator is a modelling tool for residential energy, gas, hot and cold water consumption
  • NILMTK is a toolkit for evaluating the accuracy of Non-Intrusive Load Monitoring algorithms
  • ODToolkit is a toolkit for evaluating the accuracy of occupancy detection and estimation algorithms in buildings
  • CityLearn is an OpenAI Gym environment for the implementation of reinforcement learning agents in a multi-agent urban setting
  • FlexDRL is a Gym environment developed by Lawrence Berkeley National Laboratory for training reinforcement learning-based HVAC control agents
  • COBS is a platform for simulating and benchmarking occupant-centric building controls. Reinforcement learning control agents built on top of this platform can be found here
  • Google Smart Buildings Simulator is a tool for training reinforcement learning agents to optimize energy consumption and minimize carbon emissions in buildings
  • Tropical Pre-cooling is an environment for evaluating and benchmarking building energy optimization algorithms
  • Power System Toolbox (PST) is a MATLAB-based simulator for transient simulation. A modified version of PST is available here
  • EV Mobility Model is a Python program for generating daily mobility behaviour of private EVs
  • BuildingsBench is a machine learning research platform for exploring building short-term load forecasting
  • SustainGym is a suite of environments designed to test the performance of single- and multi-agent RL algorithms on realistic sustainability tasks under distribution shift
  • Modelica Buildings Library is an open source library for design and operation of building and district energy and control systems
  • Tin Hau is a time-series foundation model for building load/energy forecasting which embeds the knowledge of thousands of buildings and adopts an advanced training strategy
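As a rough illustration of the Markov-model approach to simulating residential electricity consumption mentioned in the list above, the following minimal sketch samples a load trace from a small discrete-state chain. The three states, their power levels, and the transition probabilities are hypothetical values chosen for illustration; they are not taken from the listed models or the underlying household data.

```python
import random

# Hypothetical three-state chain; states, power levels, and probabilities
# are illustrative assumptions, not values from the listed models.
STATES = ["low", "medium", "high"]
POWER_W = {"low": 100.0, "medium": 500.0, "high": 2000.0}
TRANSITIONS = {
    "low": [0.90, 0.08, 0.02],     # P(next state | current state = low)
    "medium": [0.20, 0.70, 0.10],  # P(next state | current state = medium)
    "high": [0.10, 0.30, 0.60],    # P(next state | current state = high)
}


def simulate_load(n_steps, start="low", seed=0):
    """Return a list of n_steps power values (watts) sampled from the chain."""
    rng = random.Random(seed)
    state = start
    trace = []
    for _ in range(n_steps):
        trace.append(POWER_W[state])
        # Draw the next state using the current state's transition row.
        state = rng.choices(STATES, weights=TRANSITIONS[state])[0]
    return trace


# 600 steps at an assumed 6-second resolution correspond to one hour.
trace = simulate_load(600)
```

In practice the transition probabilities would be estimated from measured data, for example by counting state-to-state transitions in a discretized consumption series.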

Datasets

Power Generation & Transmission

  • Historical data on electricity generation and transmission from ENTSO-E, the European Network of Transmission System Operators for Electricity, representing 43 electricity transmission system operators (TSOs) from 36 countries across Europe [RESTful API]
  • Global Power Plant Database is a comprehensive, open source database of power plants around the world
  • Power grid frequency measurements from around the world (mostly focused on Europe) 
  • This website contains several data sources which are helpful for power system modelling in European countries
  • Open Infrastructure Map is a view of the world’s energy infrastructure mapped in the OpenStreetMap database
  • electricityMap provides historical and near real-time data about the carbon intensity of electricity production in many countries [API]

Power Distribution

  • Power Data Portal contains data from Distribution-level Phasor Measurement Units (PMU) installed at several locations, including Lawrence Berkeley National Laboratory
  • Iowa test system is a real distribution grid model with one year of smart meter measurements from 1120 customers. The system consists of 3 feeders and 240 nodes.
  • Three synthetic distribution networks (OpenDSS models) created using the U.S. Reference Network Model

Energy Markets

  • PJM’s Data Miner provides access to public data such as generation, load, load forecast, and PJM’s market information
  • Various datasets related to European Energy Market EPEX SPOT
  • Day-ahead and real-time demand, and locational marginal pricing for the entire New England system

Energy Consumption: Whole Building & Plug Loads

  • Dataport contains one-minute appliance-level customer electricity use from nearly 1000 houses and apartments
  • NREL’s end-use load profiles dataset contains 15-minute resolution load profiles for all major residential and commercial building types and end uses across all climate regions in the United States (description)
  • NEST Open Building Data for Energy Demand and User Practice, 4 years of data from 3 buildings with a temporal resolution of 1 minute
  • Smart* Dataset contains electricity consumption data from over 400 homes
  • A three-year dataset from an office building that includes whole-building and end-use energy consumption, HVAC system operating conditions, indoor and outdoor environmental parameters, and occupant counts
  • Building Data Genome project contains hourly whole building electrical meter data from 507 non-residential buildings
  • Building Data Genome II project contains hourly measurements of electricity, heating and cooling water, steam, and irrigation meters of 1,636 non-residential buildings
  • Ground-truth data on the presence and absence of building faults from multiple building system types
  • Mortar is a data collection tool and an open dataset of buildings
  • BBD data portal contains curated and standardized benchmark field datasets representing building operational and indoor/outdoor environmental data from both commercial and residential buildings across multiple United States climate zones
  • REFIT Smart Home dataset contains aggregate-level and appliance-level electrical load measurements, building occupancy, and survey data from 20 UK homes
  • ADRES-Concept contains power consumption and voltage profiles of 30 Austrian households
  • Longitudinal data on building occupant behaviour, comfort, and environmental conditions from an air-conditioned office building in Pennsylvania, United States
  • Real and reactive power consumption of 390 US apartments measured at 10-second intervals
  • EERE dataset contains commercial and residential hourly electricity load profiles for all TMY3 locations in the United States
  • Data from 45 SPOT* Personal Thermal Comfort Systems installed in 15 offices can be downloaded from here
  • Camera-based occupancy detection dataset from a research lab
  • SEIL-R is a dataset of electricity consumption by residences of a multi-storey building at 1-second granularity
  • ComEd’s anonymized smart meter data in 30-minute intervals for all zip codes in Illinois
  • REDD is the reference energy disaggregation dataset
  • UK-DALE is a domestic appliance-level electricity dataset from the UK
  • AMPds is one-minute electricity, water, and natural gas consumption data of a residential house in Canada
  • BLOND contains continuous energy measurements of a typical office environment in Germany with common appliances and load profiles
  • BLUED electricity disaggregation dataset contains current and voltage measurements from one home in the United States
  • I-BLEND dataset contains energy consumption data recorded at one-minute intervals from seven campus buildings in India
  • EMBED is a labeled electricity disaggregation dataset containing plug load consumption data (for a variety of appliances) from three apartment units in the United States
  • CREAM is a component-level voltage and current measurement dataset for two industrial-grade coffeemakers that comes with expert-labelled component-level electrical events
  • WHITED is a dataset of appliance transients including 47 different types of appliances in 6 regions
  • LILACD includes high frequency measurements from one, two, and three concurrently running appliances of 15 different types
  • Sub-hourly measurements of energy use and indoor climate from 6 real buildings in the US, Canada, Denmark, Norway, and Singapore
  • TDC 1.0 is an air free-cooled tropical data center dataset that includes sensor measurements collected in 2018 and 2019 from a data center testbed with an air free-cooling system
  • TDC 2.0 is an air-cooled tropical data center dataset that includes sensor measurement traces collected in 2022 and 2023 from a data center testbed with a direct expansion cooling system
  • Google Smart Buildings dataset is a comprehensive collection of six years of telemetry data from three Google buildings, providing real-world insights for developing and validating optimal control solutions

Lighting for Buildings

  • The office light level dataset contains measurements of the available daylight in four unoccupied offices in a campus building in Waterloo, Canada

Building Footprint

Water Consumption

Electric Vehicles and Bikes

  • ACN-Data is a dataset of workplace EV charging which currently includes over 30,000 sessions
  • Charging duration and demand data from all city-owned electric vehicle (EV) charging stations in Boulder, CO
  • ElaadNL open dataset contains information about 10,000 EV charging transactions completed at public charging stations operated by EVnetNL in the Netherlands
  • Geographical location of public electric vehicle charging stations in the United States and Canada
  • US National Household Travel Survey (NHTS) data
  • Drive4Data contains drive cycle and powertrain data (speed, acceleration, and battery state-of-charge) of several EVs from Waterloo, ON, Canada
  • EV-SDG is a parametric model to generate synthetic EV charging sessions
  • WeBike is an electric bike dataset which contains information about trips and battery charging sessions of 31 e-bike riders
  • OpenStreetMap Mobility Demand Generator is a tool for generating EV mobility patterns for a given location
  • SPAGHETTI is a synthetic data generation tool for post-COVID EV patterns

Renewable Energy Resources

  • Ausgrid solar home dataset contains half-hourly gross solar generation, net controlled load and general electricity consumption measurements for 300 homes with solar panels installed. This data is available for 3 continuous years and the solar panel capacity is also recorded for each home
  • California Distributed Generation Statistics provides solar PV net energy metering data from 3 large utilities
  • Renewable energy power stations in European countries
  • Solar radiation data from several measurement sites in the United States: MIDC
  • Wind data from the National Wind Technology Center at NREL can be downloaded from this portal
  • Measurements from a rooftop solar installation spanning time windows when panels were fully and partially covered with snow

Climate and Weather

  • Several climate and weather datasets are listed on this website as part of the Pangeo project
  • Python package providing an API for loading and manipulating the “Subseasonal Climate USA” dataset
  • OPSD’s hourly geographically aggregated weather data for Europe
  • Meteostat is one of the largest vendors of open weather and climate data (Python library)
  • Dark Sky weather API
  • A free repository of climate data for building performance simulation

Energy Storage

  • A model for calculating Levelized Cost of Storage for 9 technologies in 12 applications from 2015 to 2050
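For readers unfamiliar with the metric, levelized cost of storage is commonly defined as discounted lifetime cost divided by discounted lifetime discharged energy. The sketch below computes that ratio under simplifying assumptions: a single upfront investment, constant annual operating cost, and constant annual discharged energy. It ignores charging cost, degradation, and end-of-life cost. The function and parameter names are hypothetical, and this is not the method of the model listed above.

```python
def lcos(investment, annual_opex, annual_energy_mwh, years, discount_rate):
    """Simplified levelized cost of storage in currency units per MWh.

    Assumes an upfront investment at year 0 and constant yearly operating
    cost and discharged energy over the lifetime (illustrative only).
    """
    discounted_cost = investment
    discounted_energy = 0.0
    for t in range(1, years + 1):
        factor = (1 + discount_rate) ** t
        discounted_cost += annual_opex / factor
        discounted_energy += annual_energy_mwh / factor
    return discounted_cost / discounted_energy


# With no discounting: (1000 + 10 * 100) / (10 * 10) = 20.0 per MWh.
undiscounted = lcos(1000.0, 100.0, 10.0, 10, 0.0)
# A positive discount rate shrinks the discounted energy, raising the result.
discounted = lcos(1000.0, 100.0, 10.0, 10, 0.08)
```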