More Fun with Regular Expressions: Word and Paragraph Parsing

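Trolling the ASP.NET forums again this morning (I know, I do it a lot), I found a question about parsing the paragraphs out of a block of text, so I knew I had to answer it. The regular expression needed is '(.+)'. This tells the Regex object to match a run of one or more characters of any kind except a line feed. Because each match stops at the line break, every non-empty line comes back as its own match, which means a paragraph here is any run of text ended by a line feed or a carriage return/line feed pair. One thing to watch: '.' does match a carriage return, so on files with Windows line endings each match carries a trailing '\r' that you may want to trim. Code for this solution would look like this: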

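// Requires: using System.IO; using System.Text.RegularExpressions;
public static MatchCollection GetParagraphs()
{
    using (StreamReader sr = new StreamReader(@"{Path To Sample File}\SampleText.txt"))
    {
        // Read the whole file, then match every non-empty line.
        string textFromFile = sr.ReadToEnd();
        Regex rg = new Regex(@"(.+)");
        return rg.Matches(textFromFile);
    }
}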
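To see what '(.+)' actually returns, here is a minimal sketch that runs the same expression against an in-memory string with mixed line endings instead of a file. The ParagraphDemo class, the SplitParagraphs helper, and the sample text are my own assumptions for illustration and are not part of the forum answer. The trimming step is there because '.' matches a carriage return, so a blank Windows-style line would otherwise come back as a "paragraph" containing only '\r'.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ParagraphDemo
{
    // Hypothetical helper: returns each non-empty line of the input as one paragraph.
    public static List<string> SplitParagraphs(string text)
    {
        var paragraphs = new List<string>();
        foreach (Match m in Regex.Matches(text, @"(.+)"))
        {
            // '.' stops at '\n' but not at '\r', so drop a trailing carriage return.
            string paragraph = m.Value.TrimEnd('\r');
            if (paragraph.Length > 0)
            {
                paragraphs.Add(paragraph);
            }
        }
        return paragraphs;
    }

    public static void Main()
    {
        // Assumed sample text: one Windows line ending, one blank line, one Unix line ending.
        string sample = "First paragraph.\r\n\r\nSecond paragraph.\nThird paragraph.";
        foreach (string p in SplitParagraphs(sample))
        {
            Console.WriteLine("[" + p + "]");
        }
    }
}
```

With this sample, the helper returns three paragraphs; the blank line between the first and second is matched as a lone '\r' and then discarded by the length check.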

I thought I would extend this to get a word count as well as all the words. In this case the expression is '(\w+)', which matches each run of one or more word characters (letters, digits, and the underscore).

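// Requires: using System.IO; using System.Text.RegularExpressions;
public static MatchCollection GetWords()
{
    using (StreamReader sr = new StreamReader(@"{Path To Sample File}\SampleText.txt"))
    {
        // Read the whole file, then match every run of word characters.
        string textFromFile = sr.ReadToEnd();
        Regex rg = new Regex(@"(\w+)");
        return rg.Matches(textFromFile);
    }
}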

Calling the Regex.Matches method returns a MatchCollection. Its Count property gives the number of matches, and the collection can be enumerated to get the actual matches.

public static void WriteMatchCollectionResults(MatchCollection mc)
{
    Console.WriteLine(mc.Count);
    foreach (Match m in mc)
    {
        Console.WriteLine(m.Value);
    }
    Console.WriteLine("...........................................");
    Console.WriteLine("");
}
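As a quick way to try the word expression without a sample file, the following sketch counts and lists the '(\w+)' matches in a hard-coded string. The WordCountDemo class, the CountWords helper, and the sample sentence are assumptions made for this example. It also shows one limitation worth knowing: an apostrophe is not a word character, so a contraction such as "can't" is counted as two words.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordCountDemo
{
    // Hypothetical helper: counts the \w+ matches in an in-memory string.
    public static int CountWords(string text)
    {
        return Regex.Matches(text, @"(\w+)").Count;
    }

    public static void Main()
    {
        // Assumed sample sentence; "can't" splits into "can" and "t".
        string sample = "The quick brown fox. It can't stop!";
        MatchCollection words = Regex.Matches(sample, @"(\w+)");

        Console.WriteLine(words.Count); // prints 8
        foreach (Match m in words)
        {
            Console.WriteLine(m.Value);
        }
    }
}
```

If contractions should count as single words, the expression would need to allow an apostrophe inside a word; that is a design choice outside what the original answer covered.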
