Thanks for looking. I'm still working on my named entity recognition project, and I'm almost done. The goal is to extract all the names of people from a long string, and I now have a list of names, which I have called ent3.
This list has some artifacts from previous processing that are incorrect. Specifically, I have elements in the list like 'Josie husband' or 'Laura fingernail'. I want to eliminate those elements completely.
Is there a way to make Python iterate over the list and remove any elements that contain an UNcapitalized word?
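One possible approach, as a minimal sketch: build a new list that keeps only the elements in which every word starts with an uppercase letter. The helper name all_words_capitalized and the sample data are assumptions made for illustration; only the list name ent3 comes from the question.

```python
def all_words_capitalized(name: str) -> bool:
    """Return True if every whitespace-separated word starts with an uppercase letter."""
    return all(word[0].isupper() for word in name.split())


# Sample data modeled on the examples in the question (hypothetical values).
ent3 = ["Josie husband", "Laura fingernail", "Laura Smith", "Josie"]

# Build a new list rather than removing items while iterating over the same list,
# which would skip elements.
ent3 = [name for name in ent3 if all_words_capitalized(name)]
print(ent3)  # ['Laura Smith', 'Josie']
```

Note that an empty string contains no words, so this check would keep it; filter those out separately if they can occur.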
For my article titles, I use CultureInfo.CurrentCulture.TextInfo.ToTitleCase(str.ToLower());
but I think it does not work correctly after double quotes, at least for Turkish.
For example, an article's title like this:
KİRA PARASININ ÖDENMEMESİ NEDENİYLE YAPILAN "İLAMSIZ TAHLİYE" TAKİPLERİNDE "TAKİP TALEBİ"NİN İÇERİĞİ.
After using the method like this:
private static string TitleCase(this string str)
{
    return CultureInfo.CurrentCulture.TextInfo.ToTitleCase(str.ToLower());
}
var art_title = textbox1.Text.TitleCase();
It returns
Kira Parasının Ödenmemesi Nedeniyle Yapılan "İlamsız Tahliye" Takiplerinde "Takip Talebi"Nin İçeriği.
This is the problem. It should be:
... "Takip Talebi"nin ...
but instead it is:
... "Takip Talebi"Nin ...
What's more, in MS Word, when I choose "Capitalize Each Word," it produces the same result:
... "Takip Talebi"Nin ...
But it is absolutely wrong. How can I fix this problem?
EDIT: First I split the sentence on spaces to get the words. If a word contains a double quote, the text after the second double quote, up to the next space, should stay lowercase. Here is the idea:
private static string _TitleCase(this string str)
{
    return CultureInfo.CurrentCulture.TextInfo.ToTitleCase(str.ToLower());
}

public static string TitleCase(this string str)
{
    var words = str.Split(' ');
    string sentence = null;
    var i = 1;
    foreach (var word in words)
    {
        var space = i < words.Length ? " " : null;
        if (word.Contains("\""))
        {
            // After every second quotes, it would get a lowercase string until the first space after the second double quote... But how?
        }
        else
            sentence += word._TitleCase() + space;
        i++;
    }
    return sentence?.Trim();
}
Edit - 2 After 3 Hours: After 9 hours, I found a way to solve the problem. I believe that it is absolutely not scientific. Please don't condemn me for this. If the whole problem is the double quotes, I replace each one with a number that I believe is unique, or with a letter that is not used in Turkish (alpha, beta, omega, etc.), before passing the string to ToTitleCase. ToTitleCase then performs the title conversion without any problems. Afterwards I replace the number or unused letter with double quotes again, so the goal is achieved. Please share it here if you have a programmatic or scientific solution.
Here is my non-programmatic solution:
public static string TitleCase(this string str)
{
    str = str.Replace("\"", "9900099");
    str = CultureInfo.CurrentCulture.TextInfo.ToTitleCase(str.ToLower());
    return str.Replace("9900099", "\"").Trim();
}
var art_title = textbox1.Text.TitleCase();
And the result:
Kira Parasının Ödenmemesi Nedeniyle Yapılan "İlamsız Tahliye" Takiplerinde "Takip Talebi"nin İçeriği
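A placeholder-free alternative can be sketched with a regular expression: lowercase everything, then uppercase only a letter that sits at the start of the string or directly after whitespace, optionally behind an opening quote or bracket. A letter that follows a closing quote, as in "Takip Talebi"nin, is then left alone. The sketch below is in Python rather than C# and does not use CultureInfo; the helper names tr_lower, tr_upper and title_case_tr are hypothetical, and the Turkish dotted/dotless i pairs are mapped by hand because Python's str methods are not locale-aware.

```python
import re

# Turkish case pairs that Python's locale-independent str.lower()/upper() get wrong.
_TR_LOWER = str.maketrans({"I": "ı", "İ": "i"})
_TR_UPPER = str.maketrans({"i": "İ", "ı": "I"})


def tr_lower(text: str) -> str:
    """Lowercase with Turkish rules for I and İ."""
    return text.translate(_TR_LOWER).lower()


def tr_upper(text: str) -> str:
    """Uppercase with Turkish rules for i and ı."""
    return text.translate(_TR_UPPER).upper()


# First word character at the start of the string or right after whitespace,
# optionally preceded by opening quotes or a parenthesis.
_WORD_START = re.compile(r"""(?:^|(?<=\s))(["'(]*)(\w)""")


def title_case_tr(text: str) -> str:
    """Title-case text, leaving suffixes attached to a closing quote lowercase."""
    lowered = tr_lower(text)
    return _WORD_START.sub(lambda m: m.group(1) + tr_upper(m.group(2)), lowered)


title = 'KİRA PARASININ ÖDENMEMESİ NEDENİYLE YAPILAN "İLAMSIZ TAHLİYE" TAKİPLERİNDE "TAKİP TALEBİ"NİN İÇERİĞİ.'
print(title_case_tr(title))
```

The design choice here is to define a word start by what precedes it (whitespace) rather than by what counts as punctuation, which is why the quote no longer triggers a capital.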
I have a plain text document I process with Visual Studio Code, with about 1,000 lines.
Each line contains a sentence that starts with an English letter.
I want to uppercase the first English letter of every line if it isn't uppercase already, using regex.
Search (match):
^[a-z]*
Replace with:
[A-Z]*
Result:
A-Z*U+0020 sentence
A-Z* sentence
To clarify, I got A-Z* followed by U+0020 (a whitespace character) at the start of each line, in all of the roughly 1,000 lines.
How could I uppercase every first English letter that isn't uppercased already, with regex?
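The attempt above fails for two reasons: the replacement field is treated as literal text rather than a pattern, and ^[a-z]* matches the whole leading run of lowercase letters instead of one letter. As a sketch of the intended transformation outside the editor, the Python snippet below matches a single lowercase letter at the start of each line and uppercases it through a callback. The function name capitalize_line_starts and the sample text are made up for illustration. Inside VS Code itself, recent versions should, as far as I know, accept a search for ^([a-z]) with the replacement \u$1.

```python
import re


def capitalize_line_starts(text: str) -> str:
    """Uppercase a lowercase ASCII letter that begins a line; leave other lines as they are."""
    # re.MULTILINE makes ^ match at the start of every line, not only at the start of the string.
    return re.sub(r"^[a-z]", lambda m: m.group(0).upper(), text, flags=re.MULTILINE)


sample = "first sentence.\nSecond sentence.\nthird sentence."
print(capitalize_line_starts(sample))
```

Lines that already start with an uppercase letter do not match the pattern, so they pass through unchanged.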
In JavaScript, I need to capitalize the first letter of each word in a string (a proper name), but not a letter that comes directly before an apostrophe, as in this example:
from henri d'oriona --> Henri d'Oriona
All I can get is something like Henri D'oriona or, at best, Henri D'Oriona.
Thanks for your help
Jacques
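One way to express this rule, sketched here in Python rather than JavaScript: uppercase a letter that starts the string or follows whitespace or an apostrophe, unless that letter is itself immediately followed by an apostrophe. The function name capitalize_name is hypothetical, and the rule is an assumption drawn from the single example above; it would also turn o'brien into o'Brien, so names with a capitalized prefix need separate handling.

```python
import re

# A letter at the start of the string, or right after whitespace or an apostrophe,
# that is not itself directly followed by an apostrophe.
_INITIAL = re.compile(r"(?:^|(?<=[\s']))([^\W\d_])(?!')")


def capitalize_name(name: str) -> str:
    """Capitalize name parts but keep a one-letter particle before an apostrophe lowercase."""
    return _INITIAL.sub(lambda m: m.group(1).upper(), name.lower())


print(capitalize_name("henri d'oriona"))  # Henri d'Oriona
```

The negative lookahead is what keeps the d lowercase, while the lookbehind on the apostrophe is what capitalizes the O.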
I'd like to ask whether there is a reason to capitalize every word of menu items and similar labels in an application's user interface, for example File->Page Setup.
Why shouldn't I just label these items as File->Page setup, etc.? This kind of capitalization just seems wrong to me, but I am not a native English speaker, so I may simply not get it.