Data Inspection

SBC data workshop, presented by An Bui (PhD student), Kyle Emery (Assistant Researcher), and Li Kui (SBC Information Manager)

Objectives

  • Develop skills in inspecting and visualizing data using R.
  • Subset and visualize geospatial data in NetCDF format.
  • Use AI tools such as ChatGPT to support data inspection.

An Bui – Downloading and visually exploring biological datasets using SBC LTER time series data with R

By the end of this section, you will be able to:

  • Use R packages to download LTER data to your own computer.
  • Visualize data as a first step to exploring LTER datasets.

Participants are provided with an R script and a Quarto document so they can follow along or code live. A rendered version of the Quarto document is also available.
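The download-then-plot workflow can be sketched in base R as follows. This is a minimal illustration, not the workshop script: the helper `read_lter_csv()`, the column names `date`, `site`, and `count`, and the `-99999` missing-value code are all assumptions, and a small made-up CSV stands in for a downloaded file. With a real dataset, a data file URL could be passed to the helper instead of the local path.

```r
# Hypothetical helper: read a CSV from a URL or a local path.
# The -99999 missing-value code is an assumption; check the dataset metadata.
read_lter_csv <- function(src, na_codes = c("", "NA", "-99999")) {
  read.csv(src, na.strings = na_codes, stringsAsFactors = FALSE)
}

# Toy stand-in for a downloaded time series (made-up values, not SBC LTER data).
toy_file <- tempfile(fileext = ".csv")
writeLines(c(
  "date,site,count",
  "2020-07-01,A,12",
  "2021-07-01,A,-99999",
  "2022-07-01,A,20"
), toy_file)

dat <- read_lter_csv(toy_file)
dat$date <- as.Date(dat$date)

# First look: column types and number of missing values.
str(dat)
n_missing <- sum(is.na(dat$count))

# Quick plot, written to a temporary PDF so the sketch also runs non-interactively.
plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file)
plot(dat$date, dat$count, type = "b",
     xlab = "Date", ylab = "Count", main = "Toy time series")
dev.off()
```

In an interactive session, the `pdf()` and `dev.off()` calls can be dropped so the plot appears in the plotting pane.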

Kyle Emery – Downloading and processing the kelp canopy dataset for a specific area of interest using R

Steps for processing and visualizing the dataset:

  1. Download the canopy dataset and extract key variables.
  2. Set a bounding box and subset data within that area.
  3. Visualize the extent of the data to ensure the correct area was subset.
  4. Plot the data as a map and scale the display by canopy biomass.

Participants are provided with an R script so they can follow along or code live.

Li Kui – Leveraging the AI tool ChatGPT to broaden and speed up data exploration

This section demonstrates how ChatGPT can assist with data quality checking, cleaning, and quick visualization, all without writing any code. The showcase includes tasks such as:

  • Generating column summaries
  • Listing unique values
  • Creating time series plots with regression lines
  • Subsetting data
  • Converting data between wide and long formats
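One way to cross-check ChatGPT's output is to reproduce the same tasks in R. The sketch below uses base R only and a made-up table; the column names `year`, `site_A`, and `site_B` are assumptions for illustration and do not come from the workshop data.

```r
# Toy wide-format table (made-up values): one row per year, one column per site.
wide <- data.frame(year = 2018:2022,
                   site_A = c(10, 12, NA, 15, 18),
                   site_B = c(5, 7, 6, 9, 11))

# Column summaries and unique values.
print(summary(wide))
print(unique(wide$year))

# Wide to long: one row per year and site.
long <- reshape(wide, direction = "long",
                varying = c("site_A", "site_B"),
                v.names = "value",
                timevar = "site", times = c("A", "B"),
                idvar = "year")
rownames(long) <- NULL

# Subset to one site, dropping missing values.
site_a <- subset(long, site == "A" & !is.na(value))

# Time series plot with a linear regression line.
# Written to a temporary PDF so the sketch also runs non-interactively.
fit <- lm(value ~ year, data = site_a)
ts_file <- tempfile(fileext = ".pdf")
pdf(ts_file)
plot(site_a$year, site_a$value, type = "b",
     xlab = "Year", ylab = "Value", main = "Toy time series, site A")
abline(fit, lty = 2)
dev.off()
```

Comparing values such as the row counts and the fitted slope against what ChatGPT reports is a quick sanity check on its results.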

Key Takeaways:

  • ChatGPT serves as a supportive tool, not a standalone solution.
  • Effective prompts with sufficient detail are crucial for maximizing ChatGPT’s utility.

If you have any questions, please contact An Bui, Kyle Emery, or Li Kui at the Marine Science Institute, UCSB.