A Blog recording the life of a Web Scientist
Summary of International Conference on Weblogs and Social Media (ICWSM) 2012
ICWSM 2012 was held in Dublin, Ireland, from the 4th to the 7th of June 2012. It was a multi-part conference, with the first day dedicated to half and full day workshops, ranging from the impact of social media on journalism (http://www.arcomem.eu/icwsm-2012-workshop/), through the use of large scale data mining with social media (http://www.ramss.ws/), to new and exciting ways to visualise the ever growing network of social media data (http://socmedvis.ucd.ie/). The remaining days (5-7th) were for the main conference event, with a single track schedule that ensured that all papers and posters could be attended. Before discussing the main conference, let’s spend a moment on the Social Media Visualisation workshop.
SocMedVis (as it likes to be called) was opened by Ben Shneiderman, who provided a great start to the workshop. His keynote explored the challenges of visualising the ever growing pool of data that social networking sites are generating, and gracefully reflected on his highly cited phrase: “Overview, Zoom and Filter, Details on Demand”. A really important message Ben gave was about the practicality of network visualisations: they are there to allow users to think, not to paint and view a picture. The network needs to be functional and allow the user to perform tasks, and as he stated, it should do this in three ways: providing an overview of the entire network (the macro), zooming in and filtering on specific parts of the network (the micro), and then providing details on these when required. Having a pretty network is only a benefit, not essential. Interestingly, this has some strong ties with the research I’ve been doing on creating a methodology for understanding Web Activity, and how one needs to be able to examine both the micro and the macro, with detail when required. I digress, but this has definitely left me thinking about the content and context of networks. Towards the end of Ben’s talk, he introduced a new way (which his PhD student is working on) to visualise networks, reducing clutter and complexity using a concept known as glyphs. In essence, it replaces fans (the attached nodes of a high in-degree node) with arches, which are of different sizes based upon the number of attached nodes within the fan. There are also other concepts, such as bridges for multi-connected nodes, but the general idea is to provide a cleaner way to understand a network diagram. Something to look out for indeed!
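To make the fan idea a little more concrete, here is a minimal Python sketch of how a fan could be collapsed into a single sized glyph. This is only an illustration under my own assumptions, not the technique presented in the keynote: a "fan" is taken to be the set of degree-1 nodes attached to the same hub, and the function name `collapse_fans` and the `min_fan_size` parameter are hypothetical.

```python
from collections import defaultdict


def collapse_fans(edges, min_fan_size=2):
    """Collapse each hub's degree-1 neighbours into one fan glyph.

    edges: iterable of (u, v) pairs, treated as undirected.
    Returns (kept_edges, fans), where fans maps a hub to the number of
    leaf nodes that were folded into its glyph.
    """
    edges = list(edges)  # we iterate twice, so materialise first

    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)

    # A leaf has exactly one neighbour; group leaves by that neighbour.
    leaves_by_hub = defaultdict(list)
    for node, nbrs in neighbours.items():
        if len(nbrs) == 1:
            (hub,) = nbrs
            leaves_by_hub[hub].append(node)

    # Assumption: only collapse when the hub has enough leaves and is
    # not itself a leaf (which would be an isolated pair of nodes).
    fans = {}
    collapsed = set()
    for hub, leaves in leaves_by_hub.items():
        if len(leaves) >= min_fan_size and len(neighbours[hub]) > 1:
            fans[hub] = len(leaves)
            collapsed.update(leaves)

    kept_edges = [
        (u, v) for u, v in edges if u not in collapsed and v not in collapsed
    ]
    return kept_edges, fans


example_edges = [
    ("a", "b"), ("a", "c"), ("a", "d"),  # three leaves hanging off hub "a"
    ("a", "e"),                          # a link between the two hubs
    ("e", "f"), ("e", "g"),              # two leaves hanging off hub "e"
]
kept, fans = collapse_fans(example_edges)
```

A drawing layer could then render each entry in `fans` as a single arch scaled by its count, while `kept` holds the edges that still need to be drawn individually.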
After Ben’s keynote there was a coffee break, during which the morning poster session commenced. Despite it being only a workshop day, attendance was great, and the poster that I was presenting, on visualising Twitter networks using a classification model, was well received, with questions, advice and plenty of discussions providing ideas to take away and develop further. One particular discussion left me thinking about not only the cascades of retweets within a network (and using this as a way to identify influential individuals), but also the use of entities within tweets – URLs, extra hashtags, photos – to create cascades of Tweets.
After the morning coffee break, 3 papers were given (4 were listed, but one of the presenters couldn’t make it), offering a range of exciting research focusing on Images, Blogs and Streams. The 3 selected papers offered some great research: exploring different ways of understanding cultural differences within Instagram (http://www.aaai.org/ocs/index.php/ICWSM/ICWSM12/paper/view/4782/), examining the posting behaviour of bloggers and how they use social media (http://www.aaai.org/ocs/index.php/ICWSM/ICWSM12/paper/view/4741/), and finally the introduction of a new, large scale research project which aims to develop a new method and set of tools to analyse information in blogs (http://www.aaai.org/ocs/index.php/ICWSM/ICWSM12/paper/view/4740/). Some really interesting things were taken away from these three presentations. Firstly, the ability to spot cultural differences based on the colours of Instagram photos, and also the openness and usability of Instagram’s API, allowing for mass collection of data. Secondly, the relationship between people who blog and their use of Twitter – 82% of bloggers use and advertise on Twitter – and the times at which people write blogs (any time of the day) compared to their use of Twitter, which is predominantly during the day. Finally, it was great to see research that is pushing the multidisciplinary angle, being aware that visualising networks can only provide so much information; it’s about the context as well.
After lunch, the Workshop resumed (with even more attendees than before) with an Applications panel discussing visualisation of data from a multi-disciplinary perspective, a great way to follow the final presentation of the morning. This raised some interesting topics regarding the use of social media visualisation in the social sciences, and how it can be used effectively and efficiently. This draws upon some of the research areas I’ve been investigating, specifically the use of Big Data (and visualising it) within the social sciences. How can this data be used efficiently to gain a better understanding of social processes, structures, etc.? These types of questions really probe at the fundamental capabilities of the disciplines in question, but are worth asking.
Another coffee break followed this panel, providing another round of questions, networking and discussions, fuelled by the thought-provoking debates regarding the use of visualisation across disciplines. The second and final paper session focused on microblogs, from examining how large amounts of data can be distilled and visualised simply (http://www.aaai.org/ocs/index.php/ICWSM/ICWSM12/paper/view/4785/) to examining how political opinions and stance can be identified through the use of social media (http://www.aaai.org/ocs/index.php/ICWSM/ICWSM12/paper/view/4774/). Again, this session showcased some great working applications and research ideas for interpreting and visualising large streams of microblogging data (Twitter specifically), and also opened up a number of directions in which my work on Visualising Twitter conversations can be taken, specifically the use of Web-based visualisation. Overall, it was a great workshop, and the community that attended, the papers presented, and the discussions had are promising for next year’s workshop.
The main conference track started on the 5th June, opening with a keynote from Google+ engineering director Andrew Tomkins. His talk discussed the fundamentals of social networks: why they exist and how we need to engineer online platforms more efficiently to get the most out of them. His presentation led to a discussion of how social networks are formed so that humans can perform and complete tasks efficiently (this raised some debate on the Twitter stream), and how social networking platforms need to harness this; the next step in social networking is social task completion, especially as Web users spend 1/3 of their time using social networking sites or communicating with each other. Interestingly, his talk is very close to the concept of social computing/machines, but instead of harnessing the power of humans to complete computationally difficult tasks, the next step in social networking is harnessing what is already on the Web – and by that definition, what users already do – to turn it into a more efficient and task-orientated machine. That means taking the content already out there and the activities currently performed (retail, banking, etc.) and making them social. Andrew’s talk offered a great vision of the future direction of social networking, suggesting that such platforms now need to embrace much more than just offering communications and networking between friends; they need to make everything (on the Web) social.
The first paper presentation session focused on privacy and security, starting with a paper (which was a candidate for the best paper award) on the motivations and truth behind the use of TripAdvisor (http://www.aaai.org/ocs/index.php/ICWSM/ICWSM12/paper/view/4675/). This not only discussed the J-shape distribution of product reviews, but also discussed the current processes used to determine fake reviews – by hand, only 62% of the fake reviews can be found. An interesting finding of this paper was that single-time reviewers tended to be more extreme and opinionated than multi-time reviewers. Is this to do with the fact that the latter users are more worried about their social presence in the reviewing community? Shifting focus towards privacy, Stutzman’s paper on privacy on social networking sites and its relevance to social capital (http://www.aaai.org/ocs/index.php/ICWSM/ICWSM12/paper/view/4666/) also proved to be interesting, finding that social capital doesn’t really play a prominent role in determining one’s privacy concerns or settings. This tied nicely into Page’s presentation on boundary preservation and the use of Google+’s circle feature (http://www.aaai.org/ocs/index.php/ICWSM/ICWSM12/paper/view/4679/); users were more aware of their use and control over their circles, which was closely related to their desire to preserve their social boundaries related to their offline circles. This is interesting, as Stutzman’s research suggests that social capital doesn’t affect privacy (which implicitly is affecting the network that they are in), yet individuals aim to preserve their circles of users based on the pre-existing offline boundaries. The final presentation of the session fit well with the on-going discussion on privacy, examining the fine line between disclosure and concealment for social media users (http://www.aaai.org/ocs/index.php/ICWSM/ICWSM12/paper/view/4613/).
Using Item Response Theory (IRT) as an analytical framework, a study was performed to examine the disclosure of Facebook users’ metadata, such as political position, gender, age, etc. From this, two interesting points were raised: men were more willing to share information (beyond their inner circle), and geographic and work-related information was highly valued and less likely to be shared in a public setting.
Following this session, the lightning presentations were given, consisting of 9 one-minute presentations, each supported by a single slide (or a copy of the presenter’s poster), all around the topics of diffusion & propagation, topics and sentiment analysis. This is a presentation format that I hadn’t seen before, and it actually offers a really good way for the audience to engage with the presenters, as after they present, they get to stand by their posters and take questions.
The afternoon presentations shifted focus towards user profiling and grouping, again starting with a best paper candidate – examining how social media can be used to examine different characteristics of a city (www.aaai.org/ocs/index.php/ICWSM/ICWSM12/paper/view/4682/). A highlight of this session, and a paper close to my research area, was presented by Sharad Goel of Yahoo! Research, examining the activities of users on the Web and their browsing behaviour (http://www.aaai.org/ocs/index.php/ICWSM/ICWSM12/paper/view/4660/), providing some verification and validation of the work that I’ve been doing examining the structure of the Web (albeit from a quantitative perspective). In addition to this, the research also pointed out that the browsing behaviour of a user can be used to identify specific details, such as ethnicity, financial status, etc. Also, interestingly, a user’s educational background determined the type of browsing and activities that they perform, in effect providing evidence to support the argument of the digital divide.
Another paper that stood out (it was a best paper candidate too) was Adam Sadilek et al.’s research on modelling the spread of disease using social networks and the social interactions between users (http://www.aaai.org/ocs/index.php/ICWSM/ICWSM12/paper/view/4660/). Using a large dataset, their framework for tracking disease demonstrated that users who had more interactions with infected individuals were more likely to become ill themselves, and that co-located interactions (not necessarily with the infected user) increased the risk of infection as well. With a growing trend in tackling real-world issues, the use of information diffusion in studies such as this shows the real benefit and impact of social media (and the researchers)!
The second day of the conference was opened by a keynote from Lada Adamic, with the title “the information life of social”. This was a real contrast to the previous day’s keynote by Andrew Tomkins, as it focused on social networks as a platform for sharing information. Some really interesting figures were presented, including incentives to share based on your friends – you are 7.3 times more likely to share something if your friend shares it. Another interesting figure raised was the likelihood of sharing something at all, which has a 0.26% probability. This seems low, but when you scale it up, if this were even at 1%, the information overload would cause an epidemic of shared content; social processes actually act as an effective sharing filter. Lada’s keynote then discussed the diffusion of memes, and showed that the rate of diffusion was similar to Yule’s process of evolution – how organisms mutate. Lada also showed that certain memes, based on their subject or content, spread faster and more efficiently than others; a qualitative study would have been a nice addition, to add more context as to why this was the case. Overall, the keynote offered a great insight into the diffusion of shared content, and discussed some timely topics such as the spreading of memes across social networks.
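The jump from 0.26% to 1% can be illustrated with a very simple branching-process calculation. This is a back-of-the-envelope sketch rather than anything shown in the keynote: it assumes every share is seen by the same number of friends, each of whom reshares independently with the same probability, and the audience size of 150 is a purely hypothetical figure chosen for illustration.

```python
def expected_cascade_size(share_probability, audience_size):
    """Expected total number of shares started by one seed share.

    Simple branching-process model: each share is seen by `audience_size`
    friends, each resharing independently with `share_probability`.
    Returns float("inf") when every share triggers, on average, at least
    one further share, so the cascade is not expected to die out.
    """
    reshares_per_share = share_probability * audience_size
    if reshares_per_share >= 1:
        return float("inf")
    # Geometric series 1 + r + r^2 + ... for r < 1.
    return 1 / (1 - reshares_per_share)


HYPOTHETICAL_AUDIENCE = 150  # assumed audience size, not a figure from the talk

observed = expected_cascade_size(0.0026, HYPOTHETICAL_AUDIENCE)
what_if = expected_cascade_size(0.01, HYPOTHETICAL_AUDIENCE)
```

Under these assumptions, the 0.26% rate gives fewer than one reshare per share on average, so cascades fizzle out, whereas 1% pushes the average above one and content would be expected to keep spreading.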
The rest of the morning followed the same format as before, first with a paper presentation round focusing on sentiment and emotion, followed by a number of quick-fire presentations related to geographical research topics. A paper which was of real interest to me, and possibly also to research within Southampton’s Web and Internet Science research lab, was given by Yelena Mejova, who was looking at the analysis of sentiment across different social media streams (http://www.aaai.org/ocs/index.php/ICWSM/ICWSM12/paper/view/4580/). The research showed that using a model that is trained on a diverse dataset offers a generalizable solution for examining sentiment across different social media platforms. Examining three different types of social media – reviews, Twitter, Blogs – it turns out that reviews, followed by Twitter, and then finally blogs provide the best way to train a sentiment classifier. However, the implications of this research reach beyond just the training of sentiment classifiers; it suggests that there are similarities between social media platforms, even though they are diverse and used for different purposes. With regard to the Web Observatory project (http://www.w3.org/community/webobservatory/), knowing that cross-platform sentiment analysis of multiple social media platforms is possible provides a useful reference for the research currently being conducted – examining the diffusion of information across a variety of social media platforms such as Microblogs and Wikis.
The afternoon of the second day was filled with a number of industry-led sessions, opening with a keynote from Igor Perisic, a senior director of engineering at LinkedIn, who discussed the dynamics of social networking in terms of the Job ecosystem, followed by two industry panels, on news and business. These were focused on the application and benefit of social media, but also supplied some interesting facts and figures, including: over 370 million tweets are produced per day, and 12 of the top 25 social news providers didn’t exist 10 years ago; clearly an example of a fast-paced, rapidly changing community. As with any research interested in the cutting edge, keeping up with the latest methods to collect, analyse and present findings will no doubt be a challenge, yet it is these challenges that make it so rewarding. The evening of the 6th was reserved for the conference welcome event, which took place in Dublin’s famous Guinness factory, a great place to network, discuss future research opportunities and relax with the community of social media enthusiasts.
The final day of the conference began with Fabrizio Sestina’s keynote on the new EU funded project on Collective Awareness Platforms, designed to help foster a sustainable environment and improve social innovation. The topics he discussed – Internet Science, collective actions, technology design, social policies, legal frameworks – were very similar to the goals of Web Science, especially with the call for multidisciplinary projects (which was actually a requirement to get funding). This will be an interesting project to watch, especially in terms of the overlap with Web Science research. Hopefully, in the future, collaboration between the Web Science research labs and this project will be possible.
A paper which stood out was that of Duc Minh Luu et al. (http://www.aaai.org/ocs/index.php/ICWSM/ICWSM12/paper/view/4625/), who discussed the diffusion of items across social networks. Their work, which was discussed with me during the poster presentations earlier in the week, examined the spreading of items within social networks, introducing and explaining their use of the Bass Model to examine the diffusion of entities within a network. Interestingly, their work on the temporal diffusion of content showed similarities with the research that I have been doing on the dynamic changes in the network structure of communications between users. Both Duc Minh Luu’s findings and mine show that the average degree of a network increases over time; it is interesting that this is the same for a communications (mentions) network as for the diffusion of items (i.e. URLs, images, video).
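For readers unfamiliar with the Bass Model, here is a minimal discrete-time sketch in Python. It shows only the textbook form of the model, not the variant or the fitting procedure used in the paper; the function name and the parameter values (p = 0.03, q = 0.38, a population of 1000) are hypothetical choices for illustration. In the model, p captures adoption driven by external influence, and q captures adoption driven by imitation of those who have already adopted.

```python
def bass_adoption(p, q, population, steps):
    """Cumulative adopters at each time step under a discrete Bass model.

    p: coefficient of innovation (external influence).
    q: coefficient of imitation (influence of existing adopters).
    population: total number of potential adopters.
    steps: number of time steps to simulate.
    """
    adopters = 0.0
    history = []
    for _ in range(steps):
        remaining = population - adopters
        # The adoption rate grows with the fraction that has already adopted.
        new_adopters = (p + q * adopters / population) * remaining
        adopters += new_adopters
        history.append(adopters)
    return history


# Hypothetical parameters, chosen only to produce the familiar S-shaped curve.
curve = bass_adoption(p=0.03, q=0.38, population=1000, steps=30)
```

Plotting `curve` against the time step gives the slow start, rapid middle and saturating tail that is typical of item diffusion under this model.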
Overall, the conference was a great success, offering a diverse set of research papers, keynotes and panel sessions. The workshops were a great way to begin the week, providing the opportunity to interact and network with others in the same research community. The single track format of the conference ensured that all presentations could be attended, and the lightning sessions offered a much more fluid and engaging way to discuss presenters’ research. Let ICWSM2013 be as much of a success as 2012.