BC Data Service takes a chatbot to work

In 2024, tech headlines proclaimed that generative artificial intelligence (gen AI) would change the way people work. To get some data behind this claim, the BC Data Service Division built a pilot program putting gen AI-based chat assistant technology through its paces.
The BC Data Service Division (BCDS) focuses on exploring new ways data can support decision-making. With gen AI stories and models showing up everywhere, BCDS saw great potential to transform access to information and how services are delivered. This new technology seemed to be flourishing – but we didn’t know if we were allowed to use it, how to use it, or if it was valuable.
Given gen AI is basically data condensed into a model and then repackaged into a tool, getting hands-on experience with the technology to see its capabilities and limitations was an irresistible opportunity. We wanted to know how it worked, how our clients might use it, how we might use it within the OCIO’s interim use guidelines and what issues it might pose.
So, we developed a plan to see how others were using these tools and then tried them ourselves on some of our own tasks.
Gathering context
Before we started our pilot, some early adopters in the division were already comfortably using AI tools and paying for them out of their own pockets. Most people, however, found the interim guidance unclear, and others were hesitant or even reluctant to try the tools at all.
We began by reading other studies and theorizing potential use cases for the tools. We decided to try two different tools in the pilot:
- ChatGPT (a general AI assistant)
- GitHub Copilot (a product integrated with GitHub and common development tools like VS Code, targeting the coding community)
Our incomplete understanding of the different AI tools, the rapid pace of their evolution and the uncertain value of untested products meant we weren’t ready to commit many resources to mass procurement. Instead, we opted for a small pilot with a limited duration, where the focus would be on learning, exploring and identifying business value. The pilot was not designed to evaluate products – first we needed to learn what these categories of products even were and what utility they might offer.
Pilot design
Pilot participants from our division would be tasked with using the paid version of ChatGPT and following current guidelines. The pilot was carefully designed to ensure participants understood the software and our obligations as BC Public Service employees. They would focus on trying and sharing use cases that might help us get our work done more efficiently, while also taking care not to disclose personal and confidential information when using the tool.
We selected 30 participants in the division across a wide range of roles and experience levels. Those participants have spent the past five months learning how to use AI chat assistants, testing use cases relevant to their roles and helping us measure if these tools can add value to our work.
By the numbers
68% of participants said using gen AI moderately to significantly improved the quality of their work.
69% of participants said it increased their job satisfaction.
Participants estimated an average of 4.6 hours saved per week.
Top use cases that at least one participant named as their first choice, from most common to least common:
- Ideating and outlining
- Summarizing
- Improving written communication
- Technical support
- Coding assistance
- Data analysis assistance
Preparing the pilot
The Information Management Branch supported us in building a Privacy Impact Assessment and guided us to seek approvals from IPO and the Legal Services and Risk Management Branches for the scope of the pilot. It was a significant effort to review and consider the terms, given the potential scope of the technology and sensitivity around the use of AI. We greatly appreciated the careful consideration that allowed us to feel confident and safe in our pilot process.
We assembled a group of 30 people to learn together as a cohort for six months and assess the business value of gen AI chat assistants. Staff participating in the pilot included communications and engagement specialists, system engineers, policy analysts, service managers, procurement analysts, data scientists and senior management.
The participants ranged from early adopters and excited enthusiasts to reluctant users who had never tried gen AI and were more skeptical. We believe the collaborative community structure of the project was very helpful, as it brought together people with different backgrounds, job functions and levels of enthusiasm (or pessimism) to help each other learn about the tool.
Since the group’s knowledge on how to use AI was so varied, training was offered to help them get the most out of the experience, prevent the disclosure of confidential information and ensure that results were carefully vetted for accuracy.
What we’ve found
Over the course of the pilot, we’ve learned what really worked, what didn’t, what people liked and what they disliked. Results from a midway check-in with participants suggest that using gen AI has some real business value for a variety of scenarios, even when using a free version. Perhaps unsurprisingly, our findings also suggest that the additional paid features are not suited to all use cases or all users.
What worked well
- The most common use cases involved using gen AI to summarize information, improve written materials and outline new work
- More creative uses included developing synthetic (“dummy”) data for experimentation, or converting hand-written notes from a workshop into digital text which the tool could then group, summarize and quantify
- ChatGPT’s ability to validate its own generated code and ensure functionality before providing it as a solution was impressive. It can also handle data files directly, creating visualizations and generating corresponding code for analysis from simple plain language instructions.
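The synthetic data use case above can be illustrated with a short sketch. This is a minimal, hypothetical example (the names, fields and topics are invented for illustration, not taken from the pilot): it generates fake records with no connection to real people, so experiments can proceed without exposing personal or confidential information.

```python
import random
import string

# Illustrative value pools - entirely made up, no real data involved
FIRST_NAMES = ["Alex", "Sam", "Jordan", "Taylor", "Morgan"]
TOPICS = ["licensing", "permits", "data requests", "service feedback"]

def make_dummy_record(rng: random.Random) -> dict:
    """Build one synthetic record suitable for experimentation."""
    ref = "".join(rng.choices(string.ascii_uppercase + string.digits, k=8))
    return {
        "name": rng.choice(FIRST_NAMES),
        "reference": ref,
        "topic": rng.choice(TOPICS),
    }

# Fixed seed so the dummy dataset is reproducible between runs
rng = random.Random(42)
records = [make_dummy_record(rng) for _ in range(5)]
for r in records:
    print(r["reference"], r["topic"])
```

Because every field is generated rather than copied from real materials, a dataset like this can be shared with a chat assistant freely while still exercising the same grouping, summarizing and quantifying workflows.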
What didn’t work well
- In most cases gen AI was able to provide us with responses that sounded reasonable, but it was not reliable enough that we could trust it for basic research on unfamiliar topics
- Even asking it to respond based on information recently provided to it – like a document previously uploaded to the chat – had mixed success
- While staff honed their prompting skills by reviewing online resources, providing sufficient context to ensure a valuable response was reported as a challenge by a few participants.
- Some staff have also suggested that they would benefit from further training in prompt creation, as they are not fully confident – so more investment in learning is required.
Chat assistants are designed to generate persuasive, high-quality responses in natural language. But the underlying models come with biases and flaws, so their output should never be taken at face value. Learning to stay on guard for these flaws and to challenge model responses was an important lesson for our participants. A skeptical human, persistently fact-checking, is essential when using these assistant tools, particularly when the results could influence a serious decision.
Staff were also challenged by not being able to use confidential information. The ChatGPT team licence we purchased from OpenAI stated that our inputs would not be used for model training… but it still allowed the company (and its contractors) access to the prompts we provided. Our participants worked around this by stripping confidential information from materials like correspondence and reports before submitting them to ChatGPT, or by building test materials from scratch. However, we will be particularly excited if a solution appears that allows us to securely use sensitive or confidential information directly.
Working with chat assistants
To get the best responses from a chat assistant and reduce errors in the results:
- Clarify your questions as much as possible. Be specific and provide sufficient context
- Probe the answers you receive. Ask the assistant how much confidence it has in its response, and what sources it used
- Cross check the results against a range of verified sources
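The tips above can be sketched as a simple prompt-builder. This is a hypothetical helper (the function name and structure are our own illustration, not part of any chat assistant's API): it makes the task specific, attaches context, and explicitly asks the assistant to state its confidence and sources, which supports the probing and cross-checking steps.

```python
def build_prompt(task: str, context: str, constraints: list[str]) -> str:
    """Assemble a prompt that is specific, contextualized, and self-auditing."""
    lines = [
        f"Task: {task}",            # be specific about what you want
        f"Context: {context}",      # provide sufficient background
        "Constraints:",
    ]
    lines += [f"- {c}" for c in constraints]
    lines += [
        # Probe the answer: ask for confidence and sources up front
        "After answering, state how confident you are in your response",
        "and list the sources you relied on.",
    ]
    return "\n".join(lines)

prompt = build_prompt(
    task="Summarize the attached meeting notes in three bullet points",
    context="Notes are from an internal planning meeting; no names included.",
    constraints=["Plain language", "Under 100 words"],
)
print(prompt)
```

The last step still has to happen outside the tool: cross-check whatever the assistant returns against verified sources before relying on it.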
What comes next
As we get close to the end of our pilot this spring, participants are comparing their ChatGPT experiences with Microsoft’s Copilot Chat (which is available for free to all staff as part of our Microsoft Enterprise 365 licence).
Several staff in our division are also about to embark on a Digital Office trial of GitHub Copilot, an integrated tool aimed at helping people code. Based on our limited experience with using ChatGPT for coding, we think this will be very helpful from a productivity perspective.
The staff who participated in our pilot are now very aware of the technology’s possible uses and potential challenges, including the need to develop good prompting skills. Despite the generally positive outcomes from the pilot, we are not recommending blanket purchasing of ChatGPT licences. We think that for many use cases the freely available chat assistants should be the first choice – especially Microsoft Copilot, which is included in our enterprise software package and has additional data protections. There are certain use cases where a paid model may be better suited, but in our experience the free versions offer adequate functionality in most scenarios.
Overall, the pilot project has been a success. Many participants who were completely reluctant to try the technology became some of our most interested participants. The project reminded us that the best way to become knowledgeable about new things is often to simply try them. We still have more to learn – and are really looking forward to the potential applications of AI coding for software development, data engineering and data science.
For more about how the B.C. government is exploring the possibilities of AI, visit B.C.’s artificial intelligence progress.