Text Normalizer [It’s so cool]

Scott_Collier-Weir · June 10, 2022, 7:59pm

Hey all!

I had a problem earlier today where I needed to normalize text into title case, or proper case, or whatever the English language calls it. Essentially:

Of mice AnD mEN → Of Mice and Men
MY NAME IS SCOTT → My Name is Scott

Here’s the current doc. I’d love input if there’s an easier way to accomplish it. Also, for some reason, no matter what I do, the Omitted Words function will catch any word except for “the”. I think it might be a Coda bug?

Check it out and let me know what you think! It’s pretty fun!

Paul_Danyliuk · June 10, 2022, 8:26pm

Here’s a more approachable formula IMO:

List("on", "or", "of", "and", "is").WithName(LowercaseWords,
List("LLC").WithName(IgnoredWords,
  thisRow.[All Caps Title].Split(" ").FormulaMap(
    CurrentValue.Lower().WithName(CurrentWordLowercase,
      SwitchIf(
        CurrentValue.In(IgnoredWords),
        CurrentValue,
        CurrentWordLowercase.In(LowercaseWords),
        CurrentWordLowercase,
        Concatenate(CurrentValue.Left(1).Upper(), CurrentWordLowercase.Slice(2))
      )
    )
  ).Join(" ").WithName(Sentence,
    Concatenate(Sentence.Left(1).Upper(), Sentence.Slice(2))
  )
))

It also does the split, and then for each word uses a single, centralized SwitchIf to decide whether the word should be ignored, lowercased, or title-cased. Then, separately, I’m capitalizing the first letter of the whole sentence.

In your formula, there’s a .Split(",").FormulaMap(Trim(...)) within a loop. That’s not an optimal approach, since the splitting and trimming is unnecessarily repeated for each check. It’s much better to move the splitting outside the loop: either as a separate column on the Omitted Words table that’s already calculated to a nice trimmed list of text values, or in a WithName declaration at the start of the formula, similar to how I did it in mine, but yours would be @[Omitted Words].Words.Split(",").FormulaMap(CurrentValue.Trim()).WithName(...)
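
To illustrate the hoisting idea outside of Coda, here is a rough Python sketch (not Coda formula language). The OMITTED_RAW string and both function names are made up for the example; the only point is that the split-and-trim happens once, before the per-word loop, instead of once per word.

```python
# Hypothetical stand-in for the comma-separated "Omitted Words" cell.
OMITTED_RAW = "on, or , of,and ,is"


def parse_omitted(raw):
    """Split on commas and trim each entry; meant to be called once."""
    return [part.strip().lower() for part in raw.split(",")]


def lowercase_omitted(sentence, raw=OMITTED_RAW):
    """Lowercase every word that appears in the omitted list."""
    omitted = parse_omitted(raw)  # computed once, outside the loop
    result = []
    for word in sentence.split(" "):
        # The loop only does a membership check; no re-splitting here.
        result.append(word.lower() if word.lower() in omitted else word)
    return result
```

Calling lowercase_omitted("Of Mice AND Men") walks the four words but parses the omitted list only once.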

Scott_Collier-Weir · June 10, 2022, 8:30pm

Love it - I was hoping you’d comment. Have you played with my solution? Any reason why my formula isn’t catching the word “the” whatsoever?

Paul_Danyliuk · June 10, 2022, 8:32pm

Yep, the issue with “the” is not a bug in Coda but in your formula, here:

You think you’re testing whether the word you’re currently processing is the first word, but in fact you’re testing whether this word is not what the sentence starts with. So if it’s the second “the” in your sentence, list.Find(CurrentValue) will still return 1 because the first word is also “the”.
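
The same pitfall can be reproduced in Python, purely as an illustration and not as Coda code: list.index, like Find, returns the position of the first match (Python counts from 0 where Coda counts from 1). The sample sentence and the function name are hypothetical.

```python
words = "the cat and the hat".split(" ")


def looks_like_first_word(all_words, word):
    """Buggy check: compares the FIRST match position, not the current position."""
    return all_words.index(word) == 0


# One flag per word in the sentence.
flags = [looks_like_first_word(words, w) for w in words]
# The second "the" (position 3) is also flagged, because index() finds
# the first "the" at position 0.
```

Both occurrences of "the" come back flagged as the first word, which is why the value-based lookup cannot tell them apart.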

To properly implement it your way, you’d have to Sequence().FormulaMap() over indices and test whether CurrentValue is 1. I thought of that, but then I thought it was easier to just “title case” everything including the first word, and then capitalize the first character of the sentence separately afterwards.
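
As a rough sketch of the index-based alternative, again in Python rather than Coda: iterate over positions so that "is this the first word?" is answered by the position itself, not by a lookup on the word's value. The word sets mirror the lists in the formula earlier in the thread, with "the" added as an assumption for the example; the function name is hypothetical.

```python
# "the" is added here for illustration; it is not in the original list.
LOWERCASE_WORDS = {"on", "or", "of", "and", "is", "the"}
IGNORED_WORDS = {"LLC"}


def title_case_by_index(sentence):
    """Title-case a sentence, always capitalizing the word at position 0."""
    out = []
    for i, word in enumerate(sentence.split(" ")):
        lower = word.lower()
        if word in IGNORED_WORDS:
            out.append(word)  # leave as typed
        elif i > 0 and lower in LOWERCASE_WORDS:
            out.append(lower)  # small word, and not the first one
        else:
            out.append(lower[:1].upper() + lower[1:])
    return " ".join(out)
```

With this check, title_case_by_index("the cat and the hat") capitalizes only the first "the".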

Scott_Collier-Weir · June 10, 2022, 8:35pm

Ahhhhhhh, there we go. OK, once I find time I’ll need to go in and adjust the formula.

system · September 8, 2022, 8:35pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
