Implement regex-based sentence splitting for function completion.
You can query a model directly and test the results returned when using
different parameter values with the Cloud console, or by calling the
Vertex AI API directly.
I've been asked to finish work on this function that splits sentences into a list. Unfortunately, I've been specifically told to use regex for this, with extra requirements explained in the function's docstring. Regex is a pain in the butt at the best of times - could you complete this code for me? Everything you need should be inside the snippet.
import re

def splitSentences(sentences):
    """Splits a string of sentences into a list.
    Each sentence is period-separated, save for cases where
    the phrases "Mr." or "Ms." are used, since those
    don't end a sentence."""
    sentenceList = []
    # missing code here
    return sentenceList
Response
Of course! Dealing with regex can be tricky, but a "negative lookbehind" is perfect for this kind of problem. Here is the completed code.
import re

def splitSentences(sentences):
    """
    Splits a string of sentences into a list.
    Each sentence is period-separated, save for cases where
    the phrases "Mr." or "Ms." are used, since those
    don't end a sentence.
    """
    # This regex pattern uses a "negative lookbehind" (?<!...) to find a period
    # that is NOT preceded by "Mr" or "Ms". It also matches any trailing
    # spaces to ensure the next sentence doesn't start with a space.
    # The inner group is non-capturing (?:...) so that re.split does not
    # insert captured groups into the result.
    pattern = r'(?<!(?:Mr|Ms))\. *'

    # We split the string by the pattern, and then use a list comprehension
    # to filter out any empty strings that might be created if the
    # input text ends with a period.
    sentenceList = [s for s in re.split(pattern, sentences) if s]
    return sentenceList

# Example Usage:
text = "Hello Mr. Smith. This is a test. Ms. Jones is here. This is the end."
print(splitSentences(text))
# Expected Output: ['Hello Mr. Smith', 'This is a test', 'Ms. Jones is here', 'This is the end']
TLDR
What the code does:
This function takes a string of text and splits it into a list of individual sentences. It does not split on the periods in "Mr." or "Ms.", the two abbreviations named in the docstring, so sentences containing them stay intact.
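To make that behavior concrete, here is a minimal sketch that runs the same splitting idea over a few inputs. The name `split_sentences` and the sample strings are assumptions made for illustration; they are not part of the original prompt or response. The last case shows a limitation: any abbreviation other than "Mr." or "Ms." is still treated as a sentence end.

```python
import re

# Hypothetical snake_case variant of the function, for illustration only.
def split_sentences(text):
    # Split on a period NOT preceded by "Mr" or "Ms", plus any following spaces.
    parts = re.split(r'(?<!(?:Mr|Ms))\. *', text)
    # Drop the empty string left behind when the text ends with a period.
    return [p for p in parts if p]

# Illustrative inputs (assumed, not from the original page).
cases = {
    "Hello Mr. Smith. This is a test.": ["Hello Mr. Smith", "This is a test"],
    "No trailing period": ["No trailing period"],
    "One.  Two.": ["One", "Two"],            # extra spaces are consumed
    "": [],                                   # empty input gives an empty list
    "Dr. Who arrived.": ["Dr", "Who arrived"],  # "Dr." is not protected
}

for text, expected in cases.items():
    result = split_sentences(text)
    print(repr(text), "->", result, "(as expected)" if result == expected else "(unexpected)")
```

If more abbreviations need protecting, the pattern would have to be extended; that is outside what the docstring asks for.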
What fixes I made:
I used a regex pattern with a negative lookbehind to split the string on periods, except when they are preceded by "Mr" or "Ms". The pattern also consumes any spaces that follow a splitting period, and the list comprehension drops the empty string left over when the text ends with a period.
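The lookbehind itself can be inspected without splitting anything. The sketch below, which assumes a sample sentence and variable names chosen only for illustration, uses `re.finditer` to list which periods the pattern accepts as sentence boundaries and which ones it skips.

```python
import re

# A period not preceded by "Mr" or "Ms", followed by optional spaces.
PATTERN = re.compile(r'(?<!(?:Mr|Ms))\. *')

# Sample text assumed for illustration.
text = "Hello Mr. Smith. This is a test. Ms. Jones is here."

# Offsets of the periods that the pattern treats as sentence boundaries.
boundaries = [m.start() for m in PATTERN.finditer(text)]

# Offsets of every period in the text, matched or not.
all_periods = [i for i, ch in enumerate(text) if ch == "."]

# Periods rejected by the lookbehind, shown with the two characters before them.
skipped = [text[i - 2:i + 1] for i in all_periods if i not in boundaries]

print("boundary offsets:", boundaries)
print("skipped periods:", skipped)
```

Python's `re` module requires a lookbehind to match a fixed number of characters. "Mr" and "Ms" are both two characters long, which is why they can share one lookbehind here.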
Last updated 2025-09-30 UTC.