Tale of the Cave


Author Topic: Ha.  (Read 883 times)

Mike Caron

  • Administrator
  • *****
  • Offline Offline
  • Posts: 1243
  • Alignment: Chaotic Good
  • I do say, squee.
    • View Profile
    • WWW
Ha.
« on: August 23, 2008, 02:11:09 AM »
BIG EDIT: I wrote a tool that took the invalid character sequences (which were actually 5-8 bytes long, each), and replaced them with real characters. All posts should be good now. If you see any posts with invalid characters still, and you know what those characters are supposed to be, please let me know.
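For the curious, here is a rough sketch of how a repair like that can work. The 5-8 byte length is consistent with a character that was encoded as UTF-8, misread as a single-byte Windows codepage, and then encoded as UTF-8 a second time, but that is a guess; the post doesn't say how the actual tool worked. The names `garble` and `repair` are made up for illustration, and Windows-1252 as the misread codepage is an assumption.

```python
# Hypothetical illustration only; this is NOT the tool described in the post.
# Assumption: the damage came from UTF-8 bytes being misread as Windows-1252.

def garble(text):
    """Simulate the damage: take the UTF-8 bytes and misread them as cp1252."""
    return text.encode("utf-8").decode("cp1252")

def repair(text):
    """Undo one round of that misreading."""
    return text.encode("cp1252").decode("utf-8")

original = "\u201cHa\u2026 it\u2019s fine"   # left curly quote, ellipsis, apostrophe
damaged = garble(original)

# How many bytes does each damaged character take once stored as UTF-8 again?
for ch in ("\u201c", "\u2019", "\u2026"):
    size = len(garble(ch).encode("utf-8"))
    print(repr(ch), "->", size, "bytes once garbled")

print("round trip ok:", repair(damaged) == original)
```

The three sample characters come out at 7, 8 and 7 bytes, which fits the range mentioned above. One caveat: the right double quote (U+201D) contains the byte 0x9D, which Python's cp1252 codec has no mapping for, so this simple version would raise an error on it. A real tool would need a lookup table or a more forgiving codec.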



Well, to those unaware, I upgraded the forum to the latest version the other day, and all hell broke loose.

But, after that was tamed, one thing remained broken... That's right, some posts were totally blank!

... Ok, fine, maybe that's not a "that's right" thing, but that's beside the point.

Anyway, after performing some voodoo on the database, I coaxed it to regurgitate the missing posts. Like many things regurgitated, however, they were not in their original form.

Specifically, any character invented by Microsoft Word, MSN or any other Microsoft product that isn't a real character (e.g., the curly quotes and the single-character ellipsis (...)) now looks like a bunch of garbled symbols.

New posts with those fake characters in them should be A-OK from now on. But, the old ones -- gone.

The reasons for this are arcane, but:

The fake characters were implemented as higher-numbered ASCII characters (those in the 128-255 range), which is fine in the ISO-8859-1 codepage (strictly speaking, its Windows variant, Windows-1252) that is the default on Windows. However, at some point (probably during the upgrade), the database went stupid and decided that it was holding UTF-8 data rather than ISO-whatever-1 data.

Now, converting from ISO-8859-1 to UTF-8 is very possible, and the forum even includes a big, shiny button to do so. But the two are not interchangeable: the data has to be converted. So, the forum (thinking it was dealing with UTF-8) would go along, and then puke when it saw the invalid characters. Somewhere in the process, the invalid string would be turfed in favour of "".
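A small sketch of that failure, and of what a proper conversion looks like. The sample bytes and the function names `naive_read` and `convert` are invented for illustration, and the "return an empty string on failure" behaviour is an assumption that mimics what is described above, not the forum's actual code. It also assumes the stored bytes were Windows-1252.

```python
# Bytes as a Windows program would typically store them (assumed cp1252):
# 0x93 and 0x94 are curly double quotes, 0x85 is the one-character ellipsis.
raw = b"He said \x93hi\x94\x85"

def naive_read(data):
    """Treat the bytes as UTF-8; on failure, give up and return "" (assumed behaviour)."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return ""

def convert(data):
    """Proper conversion: decode with the encoding the bytes are really in, re-encode as UTF-8."""
    return data.decode("cp1252").encode("utf-8")

print(repr(naive_read(raw)))            # the post comes back blank
print(repr(naive_read(convert(raw))))   # after conversion it reads fine
```

The point is that the bytes themselves have to change. Relabelling the data as UTF-8 without converting it leaves sequences that a strict UTF-8 decoder rejects.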

The method of fixing it was easy. Just run the big conversion button that I mentioned. However, because the forum thought it was converting FROM UTF-8 (since that's what the database thought it had), TO UTF-8, but pretending it was dealing with ISO-whatever... Anyway, I hardly understand this any more than you do.

Long story short, curly quotes and their ilk are gone. Short story long, anything by Charles Dickens.
« Last Edit: August 24, 2008, 12:07:14 AM by Mike Caron »
Logged

ScienceFair

  • Full Member
  • ***
  • Offline Offline
  • Posts: 390
  • Alignment: Neutral Good
  • Worst icon ever?
    • View Profile
Re: Ha.
« Reply #1 on: August 23, 2008, 09:18:36 AM »
And now the forum is covered in digital vomit. For some reason I feel like this probably affects CA more than anyone...
Logged

Chevalier-Ange

  • Your Fallen Angel evermore...
  • Global Moderator
  • *****
  • Offline Offline
  • Posts: 1531
  • Alignment: Lawful Good
  • Through the Fire and The Flames...
    • View Profile
Re: Ha.
« Reply #2 on: August 23, 2008, 10:00:33 AM »

Yup... you're right, it would seem. I generally use MS Word to write everything, then copy/paste it into the post. /sigh... so much editing, and so many posts to edit. I've noticed Whitey has quite a few as well, but then again, I know she uses MS Word too.   :(

And a side note to Mike. Just because the forum doesn't recognize a character, doesn't mean it's a false or phony one. Any ASCII character is a real ASCII character. Even the ones MS Word makes up /sigh
Logged

 
Take me with you, we will fly across the sea
To the land of the sun where our journeys begun
All fear is gone, we sail until the dawn
Deepest fears will burn inside your mind
For the souls lost in endless time


Their story shall be told in another time...another place

Protome

  • God of Boredom
  • Full Member
  • ***
  • Offline Offline
  • Posts: 405
  • Alignment: Chaotic Neutral
  • To make room for the tuna!
    • View Profile
    • WWW
Re: Ha.
« Reply #3 on: August 23, 2008, 11:28:04 AM »
I like Charles Dickens ;_;
Logged

ScienceFair

  • Full Member
  • ***
  • Offline Offline
  • Posts: 390
  • Alignment: Neutral Good
  • Worst icon ever?
    • View Profile
Re: Ha.
« Reply #4 on: August 23, 2008, 02:23:59 PM »
Quote from: Chevalier-Ange
And a side note to Mike. Just because the forum doesn't recognize a character, doesn't mean it's a false or phony one. Any ASCII character is a real ASCII character. Even the ones MS Word makes up /sigh

So true Mike. Just because you don't believe in something doesn't mean it isn't real! How would you like me calling you a fake?!?
Logged

interrobang

  • Jr. Member
  • **
  • Offline Offline
  • Posts: 133
  • Alignment: True Neutral
  • lightning heel of doom
    • View Profile
Re: Ha.
« Reply #5 on: August 23, 2008, 03:05:18 PM »
The font is so... big... But I'm sure I'll adjust. It's not like I have to read pages and pages of RP posts. >_> (And yeah, I tried zooming in and out with control, but it's kind of laggy when you're scrolling while viewing it at a smaller size.)
Logged

Mike Caron

  • Administrator
  • *****
  • Offline Offline
  • Posts: 1243
  • Alignment: Chaotic Good
  • I do say, squee.
    • View Profile
    • WWW
Re: Ha.
« Reply #6 on: August 23, 2008, 11:11:55 PM »
Quote from: Chevalier-Ange
And a side note to Mike. Just because the forum doesn't recognize a character, doesn't mean it's a false or phony one. Any ASCII character is a real ASCII character. Even the ones MS Word makes up /sigh

Yes, in the ISO-8859-1 character encoding, character #whatever is a curly quote. However, that #whatever is NOT a valid character in UTF-8 encoding. It is encoded differently, in a way that conforms with the UTF-8 spec.
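To put numbers on "#whatever", here is a quick comparison of how the same characters are stored in each encoding. The helper names `byte_forms` and `fits_in_latin1` are made up for this sketch. It uses Windows-1252 for the single-byte side, since that is where the curly quotes actually live; in ISO-8859-1 proper, the 0x80-0x9F range holds control codes.

```python
def byte_forms(ch):
    """Return (cp1252 bytes, UTF-8 bytes) for one character, as hex strings."""
    return ch.encode("cp1252").hex(), ch.encode("utf-8").hex()

def fits_in_latin1(ch):
    """True if the character can be represented in ISO-8859-1 at all."""
    try:
        ch.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False

samples = {
    "left curly quote": "\u201c",
    "right curly apostrophe": "\u2019",
    "ellipsis": "\u2026",
}

for name, ch in samples.items():
    single, utf8 = byte_forms(ch)
    print(name, "| cp1252:", single, "| utf-8:", utf8,
          "| in ISO-8859-1:", fits_in_latin1(ch))
```

Each character is one byte in the Windows codepage and three bytes in UTF-8, and none of those single bytes is a valid UTF-8 sequence on its own, which is why the two can't be mixed.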