FJ Software forum index
Duplicate SMS messages not seen as duplicates
vodoomoth



Joined: 13.01.2012
Posts: 66

Posted: Tue Feb 02, 2016 19:57

I am facing a rather weird problem with duplicate SMS messages in the archived messages.

I have literally hundreds of pairs of seemingly identical messages (which I know have somehow been duplicated in the MPE database – I received or sent only one copy of the message). The "Search duplicates" menu entry runs without finding anything... but I still see strictly identical sender, message and time entries for those duplicates.

I imagine that the comparison is based on more than "From/To", Message and Time, which is strange to me as I, as a user, don't see these additional properties that come into play in the comparison.

Is there another way to get rid of those double entries, besides the "Search duplicates" command, which quite obviously fails to detect these many duplicates?

@FJ: I can send either logs or screen captures if needed.

I'm running the latest versions of both the PC (1.8.7) and phone (1.0.39) clients.
Thanks.
ealbamb



Joined: 17.05.2015
Posts: 4

Posted: Thu Feb 04, 2016 08:42

Hi,

I'm on 1.8.7 and have the duplicates issue too, but I'm one step behind: I cannot find the "Search duplicates" command. Can you please give me a hint? (My DB now holds more than 15000 entries.) In other discussions I read about right-clicking but cannot understand where I have to

Thank you all

_________________
bert
vodoomoth



Joined: 13.01.2012
Posts: 66

Posted: Thu Feb 04, 2016 21:09

"Search duplicates" appears in the context menu (right mouse click – for lefties :-)) when you are in "Archive (computer)" (see the "Messages" section of the left sidebar).
Duplicates are therefore removed from the archive only. If you have many of them in your Sent or Inbox folders, you can move them to the archive, remove the duplicates, and move them back to their original folders.
ealbamb



Joined: 17.05.2015
Posts: 4

Posted: Sun Feb 07, 2016 10:16

OK, found and used. It removed 133000 duplicates from the archive. However, it still creates new ones every time I sync, but that is another story. Thanks a lot, vodoomoth.

After a check I found a small number of residual duplicates, but very few. I guess there is something different in the actual text, but I can't tell what it is.

Thanks again

_________________
bert
ealbamb



Joined: 17.05.2015
Posts: 4

Posted: Sun Feb 07, 2016 10:37

After a deeper check I found that, in the duplicates left after removal, the text is different.

The difference is caused by some sort of automatic formatting of special characters (apostrophe changed to a different apostrophe type, short dash changed to long dash, suspension points (ellipsis) changed to underscore ...). In my case they are very few (17 out of 16000 SMS, after a dedup which removed 133000 duplicates). I use Italian as my language (maybe this interferes ...).
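
For anyone who wants to compare such texts outside MPE, here is a minimal Java sketch of the idea: map the typographic characters back to plain ASCII before comparing. The class name `PunctuationNormalizer` and the replacement table are assumptions made up for illustration; the characters that MPE or the phone actually substitute may differ, so the table would need to be adjusted to what an export really contains. It needs Java 9 or later for `Map.of`.

```java
import java.util.Map;

class PunctuationNormalizer {
    // Typographic characters mapped to plain ASCII equivalents.
    // This table is an assumption; extend it with whatever your export shows.
    private static final Map<Character, String> REPLACEMENTS = Map.of(
        '\u2018', "'",    // left single quotation mark
        '\u2019', "'",    // right single quotation mark (typographic apostrophe)
        '\u201C', "\"",   // left double quotation mark
        '\u201D', "\"",   // right double quotation mark
        '\u2013', "-",    // en dash
        '\u2014', "-",    // em dash
        '\u2026', "..."   // horizontal ellipsis
    );

    // Returns the text with every character from the table replaced.
    static String normalize(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            String replacement = REPLACEMENTS.get(c);
            out.append(replacement != null ? replacement : String.valueOf(c));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String reformatted = "L\u2019amico \u2013 ciao\u2026";
        String plain = "L'amico - ciao...";
        // Both variants compare equal once normalized.
        System.out.println(normalize(reformatted).equals(normalize(plain))); // prints true
    }
}
```

Two texts would then count as the same message when their normalized forms are equal, while the stored text itself stays untouched.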

Hope this helps

Thanks

_________________
bert
vodoomoth



Joined: 13.01.2012
Posts: 66

Posted: Sun Feb 07, 2016 12:47

ealbamb wrote:
.. after a deeper check I found that in duplications left after removal, text is different.

The difference is caused by some sort of automatic formatting of special characters (apostrophe changed to different apostrophe type, short dash changed to long dash, suspension bullets changed to underscore ... ).


This may be an explanation of some (or maybe most) of the duplicates. But I have such messages that have no special characters; they are pure ASCII. Example of the contents of one such message:
Quote:
Courage!


8 characters, with none that could have been replaced.

Anyway, I believe that what the user sees in the Archive view should overrule technical considerations. Same From/To, Message and Time fields in different messages must lead to a classification as duplicates.

I have uploaded screen captures of two duplicates with the 8-character contents that I've mentioned above:

Image

Image


The only difference is in the PDU field (two more characters in the second image, and FF09 at the end of the first line instead of FF08), which I, as a user, don't care about. I don't even know what it stands for.

I think the PDU field is a better candidate for an explanation of why MPE doesn't always see the duplicates that users see.

@FJ: could you add an option (a checkbox, an additional menu entry, or something else) to ignore that PDU field when determining whether two messages are duplicates? Maybe a "deep-search for duplicates" option is in order.
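
To make the request concrete, here is a minimal Java sketch of a comparison that looks only at the fields shown in the Archive view and ignores the PDU. Everything in it is hypothetical: the `Message` class, its field names and the sample values are made up for illustration, and nothing here reflects how MPE actually stores or compares messages.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class VisibleFieldDedup {
    // Hypothetical message model; MPE's real data format is not known here.
    static final class Message {
        final String contact; // the "From/To" column
        final String text;    // the "Message" column
        final String time;    // the "Time" column
        final String pdu;     // raw PDU, deliberately ignored by the comparison

        Message(String contact, String text, String time, String pdu) {
            this.contact = contact;
            this.text = text;
            this.time = time;
            this.pdu = pdu;
        }

        // Key built only from what the user sees in the Archive view.
        List<String> visibleKey() {
            return List.of(contact, text, time);
        }
    }

    // Keeps the first message for each visible key and drops later copies.
    static List<Message> removeDuplicates(List<Message> messages) {
        Set<List<String>> seen = new HashSet<>();
        List<Message> kept = new ArrayList<>();
        for (Message m : messages) {
            if (seen.add(m.visibleKey())) {
                kept.add(m);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up sample data: the first two differ only in the PDU placeholder.
        List<Message> archive = List.of(
            new Message("Alice", "Courage!", "2016-02-02 19:57", "pdu-A"),
            new Message("Alice", "Courage!", "2016-02-02 19:57", "pdu-B"),
            new Message("Alice", "See you", "2016-02-02 20:10", "pdu-C"));
        System.out.println(removeDuplicates(archive).size()); // prints 2
    }
}
```

With a key like this, two entries that look identical to the user are treated as one message regardless of any hidden field.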
tubular



Joined: 08.06.2015
Posts: 2

Posted: Fri Mar 04, 2016 13:25

Anything new on this topic?
I have the same problem...
vodoomoth



Joined: 13.01.2012
Posts: 66

Posted: Tue May 10, 2016 15:43

I have just come across two identical messages that seem to have the exact same PDU but which still aren't detected as duplicates. Therefore, unless I've missed something in that very long PDU (or it's too long to fit in the available space in the dialog box), the PDU field still isn't enough to explain the fact that MPE misses some duplicates.

FJ, have you had a chance to look into this?
vodoomoth



Joined: 13.01.2012
Posts: 66

Posted: Sun Aug 07, 2016 11:53

I have finally found out why this problem occurs: one copy of the duplicated message has a trailing whitespace character, and/or the time of the message is affected by daylight saving settings (meaning that the same message appears with two different timestamps).

I found out about the trailing whitespace because I've written a small Java program to process an MPE export of messages into a text file so as to remove duplicates. It reads the exported message file and copies messages to either of two files, one with the "pristine" conversation(s) free of any duplicates, and the other that contains the duplicates. I haven't dealt with duplicates caused by daylight saving settings.
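
A minimal sketch of that kind of program, for anyone who wants to try the same thing. It is not the original program: it assumes a hypothetical export with one message per line and the message text as the last field, which may not match what MPE really writes. The class name `ExportSplitter` and the three command-line arguments are likewise made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ExportSplitter {
    // Removes trailing whitespace only, so leading spaces stay significant.
    static String stripTrailing(String s) {
        int end = s.length();
        while (end > 0 && Character.isWhitespace(s.charAt(end - 1))) {
            end--;
        }
        return s.substring(0, end);
    }

    // First occurrences go to 'pristine' (trimmed); later copies go to 'duplicates' (as read).
    static void split(List<String> records, List<String> pristine, List<String> duplicates) {
        Set<String> seen = new HashSet<>();
        for (String record : records) {
            String key = stripTrailing(record);
            if (seen.add(key)) {
                pristine.add(key);
            } else {
                duplicates.add(record);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("usage: ExportSplitter <export> <pristine-out> <duplicates-out>");
            return;
        }
        // Assumption: the export is UTF-8 text with one message per line.
        List<String> records = Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8);
        List<String> pristine = new ArrayList<>();
        List<String> duplicates = new ArrayList<>();
        split(records, pristine, duplicates);
        Files.write(Paths.get(args[1]), pristine, StandardCharsets.UTF_8);
        Files.write(Paths.get(args[2]), duplicates, StandardCharsets.UTF_8);
    }
}
```

Keeping the duplicates in a separate file rather than discarding them makes it possible to review what was removed before touching the archive.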

I haven't yet tried exporting all archived messages, deleting them from the archive, and reimporting the cleaned-up export. I guess there should be no problem doing this.
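
For the daylight-saving duplicates, which the program described above does not handle, one possible approach is to treat two timestamps as the same when they are equal or exactly one hour apart. The sketch below is only an illustration under that assumption: the one-hour offset, the class name `DstTolerantCompare` and the method names are made up here, and two genuinely different messages with the same text sent exactly one hour apart would be wrongly merged.

```java
import java.time.Duration;
import java.time.LocalDateTime;

class DstTolerantCompare {
    // True if the two times are identical or differ by exactly one hour,
    // the shift assumed here for a daylight-saving mismatch.
    static boolean sameTime(LocalDateTime a, LocalDateTime b) {
        Duration diff = Duration.between(a, b).abs();
        return diff.isZero() || diff.equals(Duration.ofHours(1));
    }

    // Two messages look like duplicates if contact and text match
    // and the timestamps pass the tolerant comparison.
    static boolean looksLikeDuplicate(String contactA, String textA, LocalDateTime timeA,
                                      String contactB, String textB, LocalDateTime timeB) {
        return contactA.equals(contactB) && textA.equals(textB) && sameTime(timeA, timeB);
    }

    public static void main(String[] args) {
        LocalDateTime first = LocalDateTime.of(2016, 8, 7, 11, 53);
        LocalDateTime shifted = first.plusHours(1);
        // Same contact and text, timestamps one hour apart.
        System.out.println(looksLikeDuplicate("Alice", "Courage!", first,
                                              "Alice", "Courage!", shifted)); // prints true
    }
}
```

Because of the false-positive risk, it is probably safer to write such matches to a separate file for review than to delete them automatically.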
xanda



Joined: 16.05.2013
Posts: 13

Posted: Sat Oct 28, 2017 15:31

We can see the issue too: after exporting to CSV, the trailing whitespace is clearly visible when the file is viewed in a spreadsheet.

Is there a way to handle this automatically?

We have thousands of messages and reckon about a third are duplicates. It's very tedious trying to delete them all manually.

Any suggestions? Thanks.
All times are GMT + 1 hour.