Saturday, 21 January 2017

One Nation, Divided By A Common Language

quote [ Thanks to the internet, the American language seems to be exploding at an almost unclockable pace. Polyglot social media has exposed American English, with its historically promiscuous embrace of new idioms, to more digital pidgins, foreign words, microdialects, pictograms, neologisms, and cryptic symbols than any one language user can ever expect to brook. ]

This is mostly a test of the "prep post" system. I hope you find it interesting, anyway.
[SFW] [science & technology] [+4 Interesting]
[by midden@3:39pmGMT]

Comments

lilmookieesquire said[1] @ 7:42pm GMT on 21st Jan [Score:1 Interesting]
While I'm sure the internet has done a lot to connect people and change language in and of itself, I'm not sure that this "surge" isn't just that language is being written down in a way that can be analyzed by computers.

A nice example of this is the programming language R, which has a function (a short program compiled by nerds) that can scan for "emotionally associated words" (which has cultural issues and sexist biases).

For my project, based off a Stanford dataset for a longer and better paper, I analyzed the /r/randomactsofpizza (araop or raop- I can't remember) from the years 2012-2015 and tried to sort out differences between successful and unsuccessful requests- including attitudes/upvotes/community involvement on all of Reddit/community involvement in that particular subreddit.

Given that English is such a dynamic language, I'd go out on a limb and say that this is the fault of science "journalism" rather than a new phenomenon (but you can't really measure it compared to the past, because the data isn't really there, written down in a manner that can be easily accessed).
steele said @ 8:25pm GMT on 21st Jan
Find anything interesting?
lilmookieesquire said[2] @ 8:48pm GMT on 21st Jan [Score:1 Interesting]
Nothing too wonderful. I had issues with data size and cleaning data that made meaningful results difficult to get to within my time and scope limits (the Stanford paper, which is better, was a graduate thesis so they had a good chunk of time, whereas I had like two or three weeks and very limited space, which included picking and understanding the data set). But being positive was generally slightly more beneficial, while the words "not" or "don't" were more associated with failed requests. (The word truncation function I was using was also imperfect.)
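
For anyone curious what that kind of word-vs-outcome comparison looks like, here is a minimal Python sketch. It is not the code from the project (that was in R); the function names and the toy requests are made up for illustration, and it only computes, for each word, the share of requests containing it that got a pizza.

```python
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep apostrophes so "don't" stays one token.
    # Returns a set, so each word counts once per request.
    return set(re.findall(r"[a-z']+", text.lower()))


def success_rate_by_word(requests, min_count=2):
    """requests: iterable of (text, got_pizza) pairs.

    Returns {word: fraction of requests containing it that succeeded},
    skipping words seen in fewer than min_count requests.
    """
    seen = Counter()
    won = Counter()
    for text, got_pizza in requests:
        for word in tokenize(text):
            seen[word] += 1
            if got_pizza:
                won[word] += 1
    return {w: won[w] / seen[w] for w in seen if seen[w] >= min_count}


# Hypothetical toy data, NOT taken from the real subreddit dataset.
TOY_REQUESTS = [
    ("Thank you so much, will pay it forward", True),
    ("Thank you all, lost my job this week", True),
    ("I do not have any food", False),
    ("Don't have money and not sure what to do", False),
    ("Not eaten today, thank you for reading", True),
]

rates = success_rate_by_word(TOY_REQUESTS)
for word, rate in sorted(rates.items(), key=lambda kv: kv[1]):
    print(f"{word:>6}  {rate:.2f}")
```

With real data you would want stemming, a much larger minimum count, and some check that the differences aren't just noise, which is exactly where the time goes.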

Obviously upvotes correlate with successful requests, but that would have been an entirely different paper; it went in as a "with more funding..." / "future research" recommendation.

The next thing I would have investigated was the time of the request.

The real difficulty was that, for most of the accounts, the request was their first comment in that sub.

My paper was for a data mining and analysis class so the paper was more of a demonstration of techniques and cleaning the data took so. Much. Time.

In my defense my professor did one on mortgages and it took her about a year to clean the data and make it useable. The nice thing is that even those tools are advancing at great speed.

SPSS (for example) is so, so much easier to use now than it was in 2008. Everything is getting more automated, but the tricky part is understanding the techniques and assumptions in that automation (which is something that led to the financial meltdown, with AAA rated bonds containing huge amounts of risk via the inability of clients to pay back the loans).

So there's real danger in just being a button pusher without understanding the mechanics- but we're all guilty of that to some degree.
steele said[1] @ 8:59pm GMT on 21st Jan [Score:1 Interesting]
Have you seen Syntaxnet's Parsey McParseface? I'm hoping to conquer that this summer. I'm thinking of utilizing a cross between that and mturk workers.

If the chatbot neural net finishes without issue I'm hoping to dump the SE comment database and see if I can build us a bot. Even better if I can find a working copy of the old site's comments.
lilmookieesquire said @ 10:54pm GMT on 21st Jan [Score:1 Hot Pr0n]
lilmookieesquire said[2] @ 8:53pm GMT on 21st Jan [Score:1 Interesting]
Here's the gist of their paper:

Long ass link

The gist of what they found:
And the researchers found that family, money, and job narratives improved the likelihood of pizza fulfillment, while the other categories were neutral or reduced the chance of success. Politeness (specifically expressions of gratitude), evidentiality (like including a photo), reciprocity (an indication that the poster would potentially give pizza to someone else at another time), sentiment, and length were other factors contributing to success.
