
Data Dives

The NCSC Data Dives series is a forum to help courts dive deeper into data analytics and gain valuable insights to help solve their data problems. Through a combination of interactive/group and individual/focused sessions, we discuss past projects, innovations and topics of interest to the courts. These discussions also support the strategic future-ready court planning outlined in NCSC’s Just Horizons initiative.

Anticipated future topics:

  • Using Statistical Software to Expand What You Can Do with Data
  • Georeferencing Everything
  • Best Practices When Working with PII Data
  • Working with Audio and Video Data
  • Forecasting your Case Filings

Dive #1: Web Scraping 101, July 2023

Web scraping is the process of extracting data from websites using automated tools or programs to access web pages, read the page’s source code or other structured data on those pages, and extract the desired information.

In this overview, you will learn:

  • The difference between web scraping and web crawling
  • What skills and tools are used for web scraping
  • Common use cases
  • Risks for sites/organizations that have their sites scraped
  • Ways to mitigate risks
  • When web scraping is harmless

A decision tree infographic is also available.
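As a minimal sketch of the extraction step in web scraping, the Python example below reads a page's source code and pulls the rows out of an HTML table using only the standard library. The page content, the case numbers, and the names `SAMPLE_HTML`, `TableScraper`, and `scrape_table` are hypothetical and used for illustration only; a real scraper would first download the page and should respect the site's terms of use.

```python
from html.parser import HTMLParser

# Hypothetical page source (an assumption for illustration). A real scraper
# would download this text first, for example with urllib.request.
SAMPLE_HTML = """
<table id="calendar">
  <tr><th>Case Number</th><th>Hearing Date</th></tr>
  <tr><td>CV-2023-001</td><td>2023-07-10</td></tr>
  <tr><td>CV-2023-002</td><td>2023-07-11</td></tr>
</table>
"""


class TableScraper(HTMLParser):
    """Collect every table row as a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None       # cells of the row currently being read
        self._in_cell = False  # True while inside a <td> or <th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
        elif tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._in_cell and self._row:
            self._row[-1] += data.strip()


def scrape_table(html):
    """Parse the HTML text and return the table rows it contains."""
    parser = TableScraper()
    parser.feed(html)
    return parser.rows


rows = scrape_table(SAMPLE_HTML)
header, records = rows[0], rows[1:]
print(header)
for record in records:
    print(record)
```

The same pattern extends to other structured elements on a page, such as lists or links, by changing which tags the parser tracks.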

Dive #2: Beyond ChatGPT: How can AI tools help you?, August 2023

This live workshop during the Data Specialists and Information Technologists Summit focused on NCSC's experience using large language models, such as ChatGPT, to extract data from court documents. Our team provided an overview of how ChatGPT works, its limitations, and ways to work around them. The workshop also featured a demonstration of a data pipeline that took in PDF documents, performed optical character recognition (OCR) to extract the text, and then restructured the text into a CSV file using ChatGPT.
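The overall shape of such a pipeline can be sketched in a few lines of Python. This is not the code shown at the workshop: the OCR step and the language-model step are both replaced by placeholder functions (`ocr_stub` returns canned text, and `extract_fields_stub` uses a regular expression instead of ChatGPT), and the file names, field names, and document text are hypothetical. The sketch only shows how the three stages connect.

```python
import csv
import io
import re

# Hypothetical output columns (an assumption for illustration).
FIELDS = ["case_number", "filing_date"]


def ocr_stub(pdf_name):
    """Stand-in for the OCR step: returns canned text for a hypothetical file."""
    canned = {
        "order_001.pdf": "Case No. CV-2023-001 filed on 2023-07-10",
        "order_002.pdf": "Case No. CV-2023-002 filed on 2023-07-11",
    }
    return canned[pdf_name]


def extract_fields_stub(text):
    """Stand-in for the language-model step: a regex pulls out two fields."""
    match = re.search(r"Case No\. (\S+) filed on (\d{4}-\d{2}-\d{2})", text)
    if match is None:
        # Leave the fields blank rather than guess when nothing is found.
        return {field: "" for field in FIELDS}
    return dict(zip(FIELDS, match.groups()))


def run_pipeline(pdf_names, ocr=ocr_stub, extract=extract_fields_stub):
    """Run OCR, then field extraction, and write one CSV row per document."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for name in pdf_names:
        writer.writerow(extract(ocr(name)))
    return buffer.getvalue()


csv_text = run_pipeline(["order_001.pdf", "order_002.pdf"])
print(csv_text)
```

Because the stages are passed in as functions, a real OCR library or a call to a language model could be swapped in for the placeholders without changing `run_pipeline`.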

Learn more by viewing the workshop materials.

Dive #3: Data Storytelling, October 2023

Data storytelling is an approach to presenting information that combines traditional text and visuals, often interactive, to simplify complex information and highlight key insights. The integration of analytics and thoughtfully crafted visuals into an engaging presentation makes this approach especially valuable at persuading targeted audiences and informing decision makers.

In this overview, you will learn:

  • What is data storytelling?
  • What are the key elements?
  • How do you get started?
  • What are the benefits?
  • Are there any disadvantages?
  • What tools or assistance are available?

Dive #4: 5 things to know about Data Governance, November 2023

With the rapidly expanding use of AI and advanced data analytics, the use of data in modern court systems continues to grow. Data are essential for effective and efficient court operations, case management, strategic planning, policy development, and budgeting. But without a cohesive and well-defined approach to managing court data, inefficiencies, poor data quality, and unnecessary risk exposure will occur. A strong data governance policy helps to prevent inefficiencies and manage risk while ensuring access to the reliable data needed for decision making and for meeting the courts' core objectives.

Dive #5: Record Linkage, April 2024

Courts today deal with large amounts of data records, often spread across different databases. As the conversation about the best methods for managing and analyzing these data advances, our team at NCSC is also exploring ways to streamline this process. Record linkage is one tool that can help. It is a method for matching data records across multiple databases using common identifiers like names and addresses.

In this overview, you will learn:

  • Different ways of doing data linkage
  • Factors to consider
  • Program suggestions to get started on matching records
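To make the idea of record linkage concrete, here is a minimal Python sketch that matches records from two small datasets by comparing names and addresses. It uses simple string similarity from the standard library's `difflib` module, which is only one of several possible approaches. The records, the field names, the 0.8 threshold, and the function names are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher

# Two hypothetical datasets describing some of the same people.
court_records = [
    {"id": "A1", "name": "Maria Lopez", "address": "12 Oak Street"},
    {"id": "A2", "name": "John Smith", "address": "400 Main Street"},
]
agency_records = [
    {"id": "B1", "name": "Jon Smith", "address": "400 Main St"},
    {"id": "B2", "name": "Maria Lopes", "address": "12 Oak Street"},
    {"id": "B3", "name": "Alan Reed", "address": "9 Pine Road"},
]


def normalize(value):
    """Lowercase the text and collapse repeated whitespace."""
    return " ".join(value.lower().split())


def similarity(a, b):
    """Return a score between 0 and 1 for how alike two strings are."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def link_records(left, right, threshold=0.8):
    """Pair each left record with its most similar right record, if any.

    The score is the average of name and address similarity; pairs that
    score below the threshold are left unlinked.
    """
    links = []
    for left_rec in left:
        best, best_score = None, 0.0
        for right_rec in right:
            score = (
                similarity(left_rec["name"], right_rec["name"])
                + similarity(left_rec["address"], right_rec["address"])
            ) / 2
            if score > best_score:
                best, best_score = right_rec, score
        if best is not None and best_score >= threshold:
            links.append((left_rec["id"], best["id"], round(best_score, 2)))
    return links


links = link_records(court_records, agency_records)
for left_id, right_id, score in links:
    print(left_id, "->", right_id, score)
```

Comparing every record against every other record is workable only for small datasets, and a fixed threshold will produce both false matches and missed matches, so results from a sketch like this would need manual review.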

For more information about Data Dives or NCSC’s Data Initiatives, email Data Scientist Andre Assumpcao.