How do you split a name to get first name and last name?

QUESTION :

I have a column containing a list of names, and I want two other columns with formulas that extract the first and last name. So far I have this:

FirstName: =LEFT(D3,FIND(" ",D3)-1)
LastName: =RIGHT(D3,LEN(D3)-FIND(" ",D3))

This works for names in the format “First Last”, but it doesn’t work when there is extra information such as “Mr. First Last”.

Is there a better way to go about this?

ANSWER :

There’s no foolproof way to do it, even ignoring titles and suffixes and stuff. Consider the following two names:

Edward Van Halen
David Lee Roth

The last names are “Van Halen” and “Roth”, but there’s no algorithmic way to tell the difference.

Probably best for StackOverflow, but there is no easy way in general. You can have a list of allowable prefixes and suffixes to make your algorithm better. But consider …

Dr. Jack Johnson Smith, PhD
Mr. Jim S. Van De Berg, Jr.

… splitting on just spaces is never going to get it completely right.
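A minimal sketch of the prefix/suffix-list idea, written in Python rather than as spreadsheet formulas. The lists `TITLES` and `SUFFIXES` and the function `split_name` are illustrative assumptions, not a complete catalogue. Everything between the first and last remaining word is treated as a middle name, which is exactly the guess that goes wrong for “Van De Berg”.

```python
# Hypothetical, deliberately incomplete lists; extend them for your own data.
TITLES = {"mr", "mrs", "ms", "dr", "prof"}
SUFFIXES = {"jr", "sr", "phd", "md", "ii", "iii"}


def _norm(token):
    """Lower-case a token and drop surrounding dots/commas for list lookups."""
    return token.strip(".,").lower()


def split_name(full_name):
    """Return (first, middle, last) after stripping known titles and suffixes."""
    tokens = full_name.replace(",", " ").split()
    # Drop leading titles such as "Mr." or "Dr."
    while tokens and _norm(tokens[0]) in TITLES:
        tokens.pop(0)
    # Drop trailing suffixes such as "Jr." or "PhD"
    while tokens and _norm(tokens[-1]) in SUFFIXES:
        tokens.pop()
    if not tokens:
        return ("", "", "")
    if len(tokens) == 1:
        return (tokens[0], "", "")
    return (tokens[0], " ".join(tokens[1:-1]), tokens[-1])
```

With these lists, `split_name("Dr. Jack Johnson Smith, PhD")` gives `("Jack", "Johnson", "Smith")`, while `split_name("Mr. Jim S. Van De Berg, Jr.")` gives `("Jim", "S. Van De", "Berg")`, so the multi-word last name is still split incorrectly.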

Also try to think about different cultures.

Just one example from Dutch: full name “Johannes Ernestus Maria van den Brink” splits up into first name “Johannes”, middle names “Ernestus Maria”, last name “van den Brink” (which should sort under B!).
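For the sorting point, one possible sketch in Python: treat a leading run of particles in the last name as a prefix that is ignored when sorting. `PARTICLES` and `sort_key` are hypothetical names, the particle list is a small assumption rather than a complete Dutch list, and the sketch assumes the last name is already stored in its own field.

```python
# Hypothetical, incomplete set of surname particles.
PARTICLES = {"van", "de", "den", "der", "het", "ter", "ten"}


def sort_key(last_name):
    """Return a (main name, particles) tuple so that sorting ignores leading particles."""
    words = last_name.split()
    i = 0
    # Skip leading particles, but always keep at least one word as the main name.
    while i < len(words) - 1 and words[i].lower() in PARTICLES:
        i += 1
    main = " ".join(words[i:])
    prefix = " ".join(words[:i])
    return (main.lower(), prefix.lower())
```

Used as `sorted(names, key=sort_key)`, this places “van den Brink” among the B names instead of under V.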

The best solution (as in the only one that works 100% of the time) is to have separate name fields and an import method that lets the user enter the right pieces in the right fields.

So… good luck…

It also fails for names with more than one first name or more than one last name. You really should store them separately. You should also split your input form into specific parts, such as title, first name, and last name. That way you can handle any spaces correctly.

You should extend your sheet with “first name” etc. columns and try to convert as many names as possible automatically, then examine the results and apply corrections by hand as needed. After this work your data will be much easier to use and extend.
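The convert-then-review workflow might look like the Python sketch below. It assumes a deliberately simple rule: names with exactly two words are split automatically, and anything else is flagged for manual correction. The function name `auto_split` and the `needs_review` flag are made up for illustration.

```python
def auto_split(full_name):
    """Split obvious 'First Last' names; flag everything else for manual review."""
    words = full_name.split()
    if len(words) == 2:
        return {"original": full_name, "first": words[0], "last": words[1],
                "needs_review": False}
    # Titles, middle names, multi-word surnames, etc. are left for a human to fix.
    return {"original": full_name, "first": "", "last": "", "needs_review": True}


def rows_to_review(names):
    """Return only the names that could not be split automatically."""
    return [n for n in names if auto_split(n)["needs_review"]]
```

For example, `rows_to_review(["David Roth", "Edward Van Halen"])` returns only `"Edward Van Halen"`, which is then corrected by hand.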
