11 December 2013

UTF8 in string literals using MySQL client over ssh

MySQL documentation ain't what it used to be; it took some experimenting to figure out how to get a utf string into my db from the command-line client interface.


The problem is that for some reason when I run mysql on my server through ssh, non-latin characters just get dropped. It might be an ssh config thing, you never know, but I explored the mysql avenue first.


Here's what I found. Suppose you want to update a field containing some non-ascii character, like "à", like this:


update events set timetable = 'de 10h à 12h' where id > 100000;

But after you paste this into your shell, the "à" is missing:


update events set timetable = 'de 10h  12h' where id > 100000;

And your data doesn't get updated the way you expect.


This is the mysql documentation on string literals: string-literals.html


And here's a handy utf8 lookup table: http://www.utf8-chartable.de/


And here's the solution:


update events set timetable = concat(_utf8'de 10h ', _utf8 0xC3A0, _utf8' 12h') where id > 100000;

To explain: _utf8 0xC3A0 gives you your "à", you concatenate that with the rest of your string and away you go. It's ugly, but it works, so stop complaining.
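
If you'd rather not look up the bytes by hand, the conversion can be scripted. Here's a rough sketch, assuming Node.js (it relies on Buffer). The name asciiSafeLiteral is made up for illustration and is not part of MySQL or any library. It splits a string into ASCII and non-ASCII runs and emits the same kind of concat expression as above:

```javascript
// Hypothetical helper (not part of MySQL or any library): rewrite a string as
// a MySQL expression containing only ASCII characters, so nothing gets
// dropped when it is pasted into a terminal.
function asciiSafeLiteral(text) {
  // Alternating runs of ASCII and non-ASCII characters.
  const runs = text.match(/[\x00-\x7F]+|[^\x00-\x7F]+/g) || [];
  if (runs.length === 0) {
    return "''";
  }
  const parts = runs.map((run) => {
    if (/^[\x00-\x7F]+$/.test(run)) {
      // Plain ASCII: quote it, escaping backslashes and single quotes.
      const escaped = run.replace(/\\/g, "\\\\").replace(/'/g, "''");
      return "_utf8'" + escaped + "'";
    }
    // Non-ASCII: emit the UTF-8 bytes as a hex literal.
    const hex = Buffer.from(run, "utf8").toString("hex").toUpperCase();
    return "_utf8 0x" + hex;
  });
  return "concat(" + parts.join(", ") + ")";
}

// "\u00E0" is the "à" from the example above.
console.log(asciiSafeLiteral("de 10h \u00E0 12h"));
```

Paste the printed expression in place of the quoted string in your update statement.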

13 November 2013

10k downloads

Yo, my rubygems.org profile has just hit 10,000 downloads!

I'm guessing that 99% of these downloads are bots though ...

Although not entirely ... the gems at the bottom of the list are either very recent (rspec_numbering_formatter), or kinda utterly pointless (ageism). On average, older gems have more downloads. Top of the list is protopack, which is actually useful as a way to store objects as YAML for later resurrection, kind of like test fixtures, but in a production context, with liberal meta-data sprinkled on top. You would think that sepa would end up being most popular though, as it's made for a critical bit of business - preparing SEPA direct debits, using the ISO 20022 xml standard (also known as "pain.008.001", although all banking standards are a pain so in this case "pain" is banking shorthand for "payments initiation"). It's newer than protopack though, so time will tell.

One more piece of data: none of the corresponding github repositories have issues opened. There are several ways to interpret this:

  • an army of bots is dedicated to downloading my stuff and archiving it for posterity all over the internet, but they don't take any more interest beyond downloading
  • real people download my stuff, decide it's unusable, and run away fast
  • real people download my stuff, it fits their needs perfectly and is 100% defect-free, and they're using it happily

Please feel free to draw your own conclusions. Then colour them in.

03 October 2013

World Wide Web for sale (slightly used)

An EFF article reports that Tim Berners-Lee, director of W3C, the standards body that attempts to specify what web browsers do, approves the inclusion of DRM restrictions in a forthcoming HTML specification.

As the EFF article points out, this new direction is likely to damage the W3C's already less-than-stellar reputation as a respectable standards body. Which is a pity, because the web is a better place when we can all agree on interoperability standards.

Your web browser, whether on your laptop or on your phone, behind the scenes, is called a User Agent. This means it represents you, the "User", giving you much freedom in the way you experience the internet. With DRM restrictions built-in, your browser becomes a copyright-enforcement-agent, and you risk losing many of the freedoms you have enjoyed up to now. We are entering a terrifying new era where government itself is the enemy; DRM is a major piece of armour in favour of centralised power and against The People.

Disappointing, Mr. Berners-Lee. Perhaps it's time to resign and let Mr. Stallman take the helm?

28 December 2012

Book: The Talent Code, Daniel Coyle


Daniel Coyle, in The Talent Code, argues against the received concepts of "born genius", "you either have it or you don't", "overnight success". He argues that talent grows where it is nurtured, describes how this is done, and shows how the growth of myelin is the neurological foundation of talent.

Coyle proposes three pillars of talent growth: "Deep Practice", "Ignition", and "Master Coaching". The concept of "hotbeds" also runs through the book.


Myelin


Prior to understanding what myelin does, it is essential to understand that our movements are the results of signals sent to muscles by nerve cells along nerve fibres.

Myelin wraps nerve fibres so that signals sent along those fibres arrive faster and more reliably. Unwrapped fibres tend to leak their signal into the surrounding mush of our bodies, so myelin works very much like electrical insulation.

Oligodendrocyte cells do the wrapping, and they target nerve cells that fire frequently. Hence the popular wisdom that "practice makes perfect". Coyle does not challenge this piece of wisdom. Myelin improves neural circuits much the way goats carve a path across a mountain, simply by following it often.

With sufficient repetition, a performance may become automatic, like walking; under normal circumstances adults never ponder how to move the next foot forward, nor where to put it. This is the effect of myelin wrapping, and the central thesis of this book is that we can deliberately cultivate myelin in order to grow our skills.



Deep Practice


Deep practice is accomplished by breaking a difficult task into chunks, and sub-chunks as far as necessary, and then performing the chunk repeatedly, stopping and fixing all errors as they are encountered, until there are no more errors, and the chunk can be played flawlessly.

This process repeatedly fires the nerve cells that produce the desired performance, so their fibres get wrapped in myelin and subsequent performances improve.

A talent hotbed has the major feature that it encourages deep practice. Brazilian footballers, for example, train by playing "futsal", a kind of mini-soccer with smaller teams and smaller rooms. The result is that each player touches the ball more often than would be the case on a standard playing field with the usual team size. This increased contact frequency provides more opportunity to practice manoeuvres, repeat them, and correct mistakes.


Research


Coyle cites Anders Ericsson's now-famous 10,000-hour rule and explains it in terms of how long it takes to wrap a single nerve fibre.

He mentions Juliet Barker's study of the Brontë sisters. Her research shows that the Brontës' skill didn't simply appear overnight, as many previous biographers had supposed; instead, the sisters had practiced deeply from an early age by writing stories for each other. The entire plot of Wuthering Heights, for example, is scattered among Emily's juvenilia.

Adriaan de Groot was perplexed by people of inferior skill who kept beating him in chess; he performed a study where he asked experts to memorize chess positions. Experts could easily remember positions from real games, but were no better than beginners at remembering random positions. De Groot used this research to show that chunking is a core feature of advanced skill - the chess experts were able to memorize positions by chunking the pieces, remembering relationships within each chunk, and relationships between chunks.



Ignition


Ignition is the word Coyle uses to describe the process of creating motivation. Some motivators include Group Belonging, Loss of Parent, Spartan Conditions, Birth Order, Dangerous World, Ambition, Coaching with Love.


Group Belonging


This appears to be one of the most powerful motivators as Coyle shows several examples where an entire nation comes to prominence in a certain sport due to the example of a single player. He frequently mentions Se Ri Pak, a South Korean golf player who out of the blue won a championship in 1998. Now, South Korean golfers dominate international women's golf. Other examples include Anna Kournikova - Russian tennis (1998), and Andruw Jones - Curaçao, baseball (1996).


Loss of Parent, Dangerous World, Spartan Conditions


Coyle gives a large list of famous individuals who lost a parent early in life. Losing a parent, it seems, has the effect of motivating the child defensively to acquire skill. Constructing a dangerous-world hypothesis will have a similar effect on a child - the child is motivated to compensate for adverse outside conditions. Spartan Conditions, on the other hand, bring the "Dangerous World" right into the practice session. Coyle notes that talent hotbeds around the world share few features except Master Coaches and Spartan Conditions - run-down, ill-maintained facilities. A classical music program for children where there are not enough violins for everybody. Futsal was born because inner-city Brazil has neither the space nor the climate to maintain expensive, full-sized soccer pitches. Coyle calls this the "Scrooge Principle" - we are stingy with our personal energy resources, until external cues motivate us to break open the piggy bank and splurge on practice.


Birth Order


I'm not sure this translates into other fields of expertise: Coyle shows that the best runners are usually among the youngest siblings in their families.


Ambition


At KIPP, students are continuously reminded that they're going to college, which is a big thing for US families. KIPP takes their vague wish/dream to go to college and turns it into an ever-present, burning, and (perhaps most importantly) realistic ambition.


Coaching with Love


The most important qualities of a child's first teacher in a specific discipline are patience, rapport, and likeability. If the child attaches to the teacher, they are likely to transfer that attachment to the study of their craft. Interestingly, it is not a requirement that the first coach be a top performer of the skill in question.


Research


Carol Dweck ran the "praise test" and discovered that praising effort is significantly more effective than praising results, for feeding motivation.

Gary McPherson studied the relationship between commitment, amount of practice, and externally-measured performance outcomes. He asked children to describe their commitment (number of weeks/years, or forever) to their craft, and created three groups - long-term, medium-term, and short-term commitment. Then he measured how much time on average each child would practice each week, and subdivided into low, moderate, and high levels of practice. It turns out that children with long-term commitment get far more out of each practice session regardless of their level of practice.

Geoff Cohen ran a test to discover how group identification affects motivation by showing students a story about a mathematician and then asking them to solve a maths problem. Half of the students read the story where the mathematician's birthday was altered to match their own; for the other half the birth date was unchanged. The birthday-match group spent on average 65% more time working on the problem.

Martin Eisenstadt is responsible for the parental-loss research; his population was the set of people eminent enough to merit at least a half-page entry in Encyclopedia Britannica.

John Bargh proposed the idea that the run-down, derelict feel of a talent hotbed fuels motivation.




Master Coaching


Coaching is an entire skill unto itself; a master coach must first acquire skill in the target domain, and subsequently acquire great skill as a coach. This is why the best coaches are often quite old - the skill takes decades to master.

Coaching sport is very different from coaching art. A master sports coach is mostly silent, whereas a master music coach delivers a lot of specific guidance.

A soccer player, when performing, needs to adapt quickly and skillfully to rapidly and unpredictably evolving circumstances on the field. A cellist performing Bach, on the other hand, strives to reproduce as accurately as possible the exact set of movements required to re-create the music.

Coyle argues that the neural circuitry in each of these cases is necessarily different. "Ivy-vine" describes the circuits for soccer, writing, and comedy; "Consistent", "oak-tree" describe the circuits for violin, golf, and gymnastics. Because the architecture of the circuits is different, the coaching style is necessarily different.

Coyle describes four major features of master coaches - the "Matrix", Perceptiveness, GPS reflex, and Theatrical Honesty.


The Matrix


[ I didn't understand this - the "matrix" seems to be another way of saying that a Master Coach skillfully and effectively combines all the other three features listed here ]


Perceptiveness


Perceptiveness is the ability to understand exactly what a student needs. Coaches do this first of all by investigating new students thoroughly - finding out everything there is to know about the students' backgrounds, for example. Secondly, understanding comes from continuously monitoring how the student reacts to the coach's instructions - whether those words were effective, and whether the coach should repeat or rephrase. Armed with this understanding, the coach can then deliver personalised instruction tailored specifically for the particular situation a given student is facing at any given time.


GPS Reflex


Master Coaches deal little in reward and punishment, verbal or otherwise. Instead, most communication comes in the form of instruction, much like a GPS navigation system. A Master Coach delivers instruction concisely, clinically, in short bursts, without being dictatorial.


Theatrical Honesty


Master Coaches exaggerate emotion - surprise, delight, warmth, shock, horror.




Notes


This book was an entertaining read, with tons of anecdotes, and plenty of research. It has the potential to change the way we think about learning, with implications for education, sport, art, business, and also how to deal with age-related diseases targeting the brain.

Myelin depends on a supply of Omega-3 fatty acids, so make sure you eat plenty of oily fish, or olive oil, or nuts, or all three. The wrapping process itself is slow, but dependable for healthy, young, well-nourished individuals.

Coyle might have usefully brought up NLP theories of motivation - according to NLP, people are either motivated "away" from an undesirable situation, or "towards" a desirable one. It might be useful to consider how "Ambition", "Group Belonging", "Birth Order" might be instances of "towards" motivation, and "Parental Loss", "Spartan Conditions", and "Dangerous World" might be instances of "away" motivation.

I find the "ivy-vine" vs "oak-tree" neural circuit metaphors dubious, and Coyle cites no research to support this. Coyle does mention how team sports are effectively self-teaching, as players get instant reward/punishment feedback in the form of keeping or losing the ball, scoring a goal or failing to stop one, winning the game or losing it. An artistic performance on the other hand implies an external judge. It's possible that coaching style is related only to the way in which the student's performance is judged.

It's worth noting that computers excel at some kinds of games because they can self-teach. The world's best backgammon player, for example, is a self-taught computer program, and my telephone beats me easily at Go. There aren't any computers producing renaissance masterpieces. Even Emily Howell, a music-writing computer program by David Cope, needs a lot of feedback and guidance to produce quality music.


16 October 2012

One Click Only

Normal people, quite reasonably, expect buttons and links and other clicky thingies on a web page to work the way they work in other software. For a generation brought up on Windows, this means double-clicking.

Alas, double clicking "Buy Now" on a web page might well mean "buy now, and buy now again" - your client doesn't want that, and you don't want the resulting credit card chargebacks and reputation hit.

Enter one_click_only, a teensy weensy jQuery plugin to prevent stuff from being double-clicked. This plugin works by calling event.preventDefault(), so it's not going to do the trick if you have custom javascript for submitting forms. You might need to consider something like event.stopPropagation() for that. But steal some ideas from this, you will figure something out.

This script works by storing the last click time in the DOM node's data. The first click goes through; subsequent clicks are allowed through if they are more than a threshold number of milliseconds later (here, 500ms). To prevent any subsequent clicking, just check if the previous_click is not null, instead of comparing it with now.
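
Here's a minimal sketch of that timing logic in plain JavaScript, without jQuery. It is not the plugin's actual source. The function name shouldAllowClick and the previousClick key are made up for illustration, and it assumes that every click, allowed or not, resets the timer:

```javascript
// Hypothetical helper, not the plugin's real code.
// `data` stands in for the DOM node's data store, `now` is a timestamp in
// milliseconds, and `thresholdMs` is the minimum gap between accepted clicks.
function shouldAllowClick(data, now, thresholdMs) {
  const previous = data.previousClick; // undefined before the first click
  data.previousClick = now;            // assumption: every click resets the timer
  if (previous === undefined) {
    return true;                       // the first click always goes through
  }
  return now - previous > thresholdMs; // later clicks need a big enough gap
}

// Simulated clicks on one element, using a 500ms threshold.
const nodeData = {};
console.log(shouldAllowClick(nodeData, 1000, 500)); // first click
console.log(shouldAllowClick(nodeData, 1200, 500)); // 200ms later
console.log(shouldAllowClick(nodeData, 2000, 500)); // 800ms after that
```

In a click handler you would call event.preventDefault() whenever this returns false.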



Use thusly:



Enjoy.

15 February 2012

Mixing Landscape and Portrait rendering in a Wicked PDF document

Wicked is an awesome ruby library for generating PDF documents from plain old HTML/CSS.

HTML and CSS are not just another language; they provide a model for representing a document, and if you were obliged to use something else you would eventually end up representing your documents using this model, even if your syntax layer differed from HTML (if you're really smart, you would have ended up with HAML/SASS, for example; same model, different prettiness).

This is why other libraries fail (where "fail" means "I don't like them") - they oblige you to learn a whole new model for representing documents.

Anyway, the point is that you might believe from the Wicked README that you can create a document in landscape mode, or in portrait mode, but you're out of luck if you want both in the same doc.

It turns out you're not out of luck; Wicked, as its name non-obviously suggests, ultimately relies on WebKit (via wkhtmltopdf) to render html pages. With WebKit, you have access to a whole bunch of modern CSS properties, including those that rotate your document. You don't even care that they're WebKit specific, because you don't have to care about cross-browser support: you're using a known webkit version running on your own server which you control.

Here's the CSS:

.page {
  width:            195mm;
  height:           270mm;
  page-break-after: always;
  overflow:         hidden;
}

.page .landscape {
  position:                  relative;
  margin:                    270mm 0;
  width:                     270mm;
  height:                    195mm;
  -webkit-transform:         rotate(-90deg);
  -webkit-transform-origin:  0mm 0mm;
}

The .page rules simply define an A4 page (after margins), and guarantee a page-break at the end of each page, just in case your printer didn't understand. The CSS assumes that you print in portrait by default. When you want landscape, nest an element with class "landscape" inside your "page" element. Here's an example:

<div class="page">
  This is the first page. It gets printed in portrait mode.
</div>

<div class="page">
  <div class="landscape">
    This is the second page. It gets printed in landscape mode.
    You will have to twist your head to read it.
  </div>
</div>

Not so bad, no? Good luck!

06 February 2012

MySQL "LOAD DATA INFILE" with no primary key

I'm migrating a whole bunch of legacy data into mysql for a client. I export from their system, use sed to patch the nonconforming unparseable CSV it outputs, convert from macroman to UTF8, then tell MySQL to load it all into a series of tables from which the data will later be integrated into the standard system.

Today I realised the error of my ways: the column that I was using as a primary key isn't a primary key at all (at least not in the modern sense - of having a unique value - that we've become accustomed to since Codd invented the relational model in 1969).

I could have realised this earlier by just declaring it a primary key and watching MySQL spit at me, and therefore have fixed the problem sooner ... but no, who needs tests and constraints when assumptions are so much faster?

Enough about me ... the problem now is to assign each row in the exported data a unique id, and the easiest, obviousest way to do that is to include an id integer primary key auto_increment column in each table definition, let LOAD DATA INFILE etc etc shove the data into the table, and the primary key takes care of itself.

But it wasn't obvious how to make this work without warnings about mismatched columns. Here's the clever bit: what I did in the end was to add the id column after loading the data. In other words, I imported the data into a table with no id primary key column; and added that column afterwards. It's easy to script, it does exactly what I want, and it produces no spurious warnings. Perfecto!

Here's the concept in code:

> create table legacy_stuff(leg_col_1 varchar(255), leg_col_2 varchar(255), et_cetera_1 varchar(255), et_cetera_2 varchar(255));

> load data infile 'legacy_stuff.csv' into table legacy_stuff; -- plus all your favourite options;

> show warnings;

> alter table legacy_stuff add column id integer primary key auto_increment;

Good luck...

Update: for another way to look at this, see the stackoverflow page on this topic. It turns out you can use load data infile with the list of columns you want to import to, in which case MySQL will match the columns of your CSV to your specified column list. This way you can let your auto-increment primary key column simply manage itself. This solution does not suit my particular situation as I'm importing tables with hundreds of columns (they didn't believe in normalisation where this data comes from), and I don't want to have to maintain the column list in multiple places.
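
If you do go the column-list route, one way to avoid maintaining the list in several places is to generate the statement from a single array. Here's a rough sketch. loadDataStatement is a made-up helper, not part of MySQL or any client library, and any field and line options you use would still need to be added before the column list:

```javascript
// Hypothetical helper: build a "load data infile" statement with an explicit
// column list from one array of column names, so the list lives in one place.
function loadDataStatement(table, csvPath, columns) {
  // Quote identifiers with backticks, doubling any embedded backtick.
  const quoteId = (name) => "`" + name.replace(/`/g, "``") + "`";
  // Quote the file path as a string literal.
  const quotedPath =
    "'" + csvPath.replace(/\\/g, "\\\\").replace(/'/g, "''") + "'";
  const columnList = columns.map(quoteId).join(", ");
  return (
    "load data infile " + quotedPath +
    " into table " + quoteId(table) +
    " (" + columnList + ");"
  );
}

// The same array could also drive the "create table" statement.
const legacyColumns = ["leg_col_1", "leg_col_2", "et_cetera_1", "et_cetera_2"];
console.log(loadDataStatement("legacy_stuff", "legacy_stuff.csv", legacyColumns));
```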