pycancensus.get_census

pycancensus.get_census(dataset: str, regions: Dict[str, str | List[str]], vectors: List[str] | None = None, level: str = 'Regions', geo_format: str | None = None, resolution: str = 'simplified', labels: str = 'detailed', use_cache: bool = True, quiet: bool = False, api_key: str | None = None) → DataFrame | GeoDataFrame

Access Canadian census data through the CensusMapper API.

The function retrieves census variables and, optionally, boundary geometries for the requested regions. An API key is required to retrieve data.

Parameters:
  • dataset (str) – A CensusMapper dataset identifier (e.g., ‘CA16’, ‘CA21’).

  • regions (dict) – A dictionary of census regions to retrieve. Keys must be valid census aggregation levels; each value is a region identifier or a list of region identifiers (e.g., {'CMA': '59933'}).

  • vectors (list of str, optional) – CensusMapper vector identifiers of the census variables to download (e.g., 'v_CA16_408'). If None, only geographic data is downloaded.

  • level (str, default 'Regions') – The census aggregation level to retrieve. One of ‘Regions’, ‘PR’, ‘CMA’, ‘CD’, ‘CSD’, ‘CT’, ‘DA’, ‘EA’ (for 1996), or ‘DB’ (for 2001-2021).

  • geo_format (str, optional) – Format for geographic information. Set to ‘geopandas’ to return a GeoDataFrame with geometry. If None, a DataFrame without geometry is returned.

  • resolution (str, default 'simplified') – Resolution of geographic data. Either ‘simplified’ or ‘high’.

  • labels (str, default 'detailed') – Variable label format. Either ‘detailed’ or ‘short’.

  • use_cache (bool, default True) – Whether to use cached data if available.

  • quiet (bool, default False) – Whether to suppress messages and warnings.

  • api_key (str, optional) – API key for the CensusMapper API. If None, the key is read from an environment variable or from a previously set key.
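
Several of the parameters above accept only a fixed set of strings. As a minimal sketch, the allowed values can be checked locally before a request is made. The helper name check_census_args and its error messages are hypothetical and not part of pycancensus; the allowed values are copied from the parameter descriptions above, and geo_format is deliberately left unchecked.

```python
from typing import List

# Allowed values copied from the parameter descriptions above.
VALID_LEVELS = {"Regions", "PR", "CMA", "CD", "CSD", "CT", "DA", "EA", "DB"}
VALID_RESOLUTIONS = {"simplified", "high"}
VALID_LABELS = {"detailed", "short"}


def check_census_args(
    level: str = "Regions",
    resolution: str = "simplified",
    labels: str = "detailed",
) -> List[str]:
    """Hypothetical helper: return a list of problems (empty if all values are valid)."""
    problems = []
    if level not in VALID_LEVELS:
        problems.append(f"unknown level: {level!r}")
    if resolution not in VALID_RESOLUTIONS:
        problems.append(f"unknown resolution: {resolution!r}")
    if labels not in VALID_LABELS:
        problems.append(f"unknown labels: {labels!r}")
    return problems


# The defaults from the signature pass; a misspelled level is reported.
print(check_census_args())
print(check_census_args(level="csd"))
```

Values are compared case-sensitively here, which is an assumption of this sketch; the library itself may be more lenient.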

Returns:

Census data in tidy format. A GeoDataFrame is returned if geo_format='geopandas'; otherwise a DataFrame is returned.

Return type:

pd.DataFrame or gpd.GeoDataFrame

Examples

>>> import pycancensus as pc
>>> # Get data for Vancouver CMA
>>> data = pc.get_census(
...     dataset='CA16',
...     regions={'CMA': '59933'},
...     vectors=['v_CA16_408', 'v_CA16_409'],
...     level='CSD'
... )
>>> # Get data with geography
>>> geo_data = pc.get_census(
...     dataset='CA16',
...     regions={'CMA': '59933'},
...     vectors=['v_CA16_408', 'v_CA16_409'],
...     level='CSD',
...     geo_format='geopandas'
... )
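
The signature types regions as Dict[str, str | List[str]], so one aggregation level can map to either a single region identifier or a list of them. The following is a minimal sketch of building such a dictionary from (level, identifier) pairs. The helper build_regions is hypothetical and not part of pycancensus, and the second CMA identifier is only illustrative.

```python
from typing import Dict, Iterable, List, Tuple, Union


def build_regions(
    pairs: Iterable[Tuple[str, Union[str, int]]]
) -> Dict[str, Union[str, List[str]]]:
    """Hypothetical helper: group (level, region_id) pairs into a regions dict."""
    grouped: Dict[str, List[str]] = {}
    for level, region_id in pairs:
        ids = grouped.setdefault(level, [])
        rid = str(region_id)  # identifiers are passed as strings in the examples above
        if rid not in ids:    # skip duplicates, keep first-seen order
            ids.append(rid)
    # A level with one identifier maps to a plain string, otherwise to a list.
    return {level: ids[0] if len(ids) == 1 else ids for level, ids in grouped.items()}


regions = build_regions([("CMA", "59933"), ("CMA", 35535)])
print(regions)
```

The resulting dictionary could then be passed as the regions argument of pc.get_census in place of the literal used in the examples above.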