I can't find a way to compute the following "string difference" in R:
raw_string <- "\"+001\", la bonne surprise de M. Jenn M. Ayache http://goo.gl/3EXxy6 via @MYTF1News"
clean_string <- "+001, la bonne surprise de Jenn Ayache"
desired_string <- "\"\"M. M. http://goo.gl/3EXxy6 via @MYTF1News"
I am not sure what to call this transformation. I would say "difference" (as in set theory, as opposed to "union" and "intersection").
My desired string contains all the characters of raw_string that are missing from clean_string, and only those, in their original order, once for each time they appear, including spaces, punctuation and everything else.
The best I have managed so far compares whole space-separated words rather than characters and drops repeated words, so it isn't good enough:
> a <- paste(Reduce(setdiff, strsplit(c(raw_string, clean_string), split = " ")), collapse = " ")
> a
[1] "\"+001\", M. http://goo.gl/3EXxy6 via @MYTF1News"
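For comparison, a character-level version of the idea can be sketched as follows. This is only a minimal sketch, assuming clean_string can be matched against raw_string left to right as a subsequence; `string_diff` is a hypothetical helper name, not an existing R function. It walks through the raw characters, consumes the next clean character whenever the two match, and keeps every raw character that does not match.

```r
# Hypothetical helper (not part of base R): returns the characters of `raw`
# that are left over after matching `clean` against it, left to right.
# Assumption: `clean` is a subsequence of `raw`.
string_diff <- function(raw, clean) {
  raw_chars   <- strsplit(raw, "")[[1]]
  clean_chars <- strsplit(clean, "")[[1]]
  keep <- logical(length(raw_chars))  # TRUE marks characters to keep
  j <- 1L                             # position of the next clean character to match
  for (i in seq_along(raw_chars)) {
    if (j <= length(clean_chars) && raw_chars[i] == clean_chars[j]) {
      j <- j + 1L                     # matched: consume one clean character
    } else {
      keep[i] <- TRUE                 # unmatched: part of the difference
    }
  }
  paste(raw_chars[keep], collapse = "")
}

string_diff("a-b-c", "abc")
```

Calling `string_diff(raw_string, clean_string)` keeps every unmatched character, so the result may differ from the hand-written desired_string in whitespace (for example, the space before the URL is kept in addition to the space after the second "M."). Because the matching is greedy, it can also pick a different occurrence of a repeated character than a human would.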