WPMu Development for Education

Making WPMU work in education, one hack at a time

Archive for the 'Data' Category

Setbacks

Posted by Instructional Technology on 27th October 2009

Some of you may have noticed that your site is missing posts. Others of you may notice that your blog is completely gone. Both are the result of a very unfortunate accident that occurred when the Systems Group tried to restore the Voices site to its state at 6 AM this morning.

As you know, we were upgrading the software this morning. This afternoon Dr. Hayward discovered that some users could not post to their blogs, and we began investigating the issue. In the process it was decided that a reversion might be necessary. When the Systems Group restored the database containing all the user and post information, they discovered it was corrupt. As they looked for a clean backup, they found that all backups after Sept. 10 were also corrupt, so the most recent version they could restore to is Sept. 10.

We are aware that roughly 40 users and hence 40 blogs were lost in this incident. Unfortunately it looks like there is no way to recover the lost users and blogs. It also means that all posts written between Sept. 10 and this morning are gone. We know that this means that several class assignments that were written as blog entries are also gone. We will continue to work with the Systems Group to try to recover as much data as possible, but be aware that it is not likely any more data will be recovered.

In the coming days we will be working with the Systems Group to develop procedures to ensure that such an accident does not occur in the future. Once these procedures have been determined we will post them to this site for the community’s benefit.
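
One check that could belong in such a procedure is confirming that every backup file can be read back in full before it is trusted. The sketch below is only an illustration and not the Systems Group's actual process: it assumes the backups are gzip-compressed SQL dumps, and the function name backup_is_readable is made up for this example.

```python
import gzip
import zlib


def backup_is_readable(path):
    """Return True if the gzip file at `path` decompresses completely.

    Reading to the end forces gzip to verify its CRC and length trailer,
    so truncated or damaged files are reported as unreadable.
    """
    try:
        with gzip.open(path, "rb") as handle:
            # Read in 1 MB chunks so large dumps do not fill memory.
            while handle.read(1024 * 1024):
                pass
        return True
    except (OSError, EOFError, zlib.error):
        # OSError covers bad gzip headers and failed CRC checks,
        # EOFError covers truncated files.
        return False
```

A check like this only shows that the file is intact. It does not show that the dump inside can be restored, so a periodic test restore to a scratch database would still be needed.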

Posted in Data, backup, blogs, courses, database, loss, site, users | Comments Off

Latest report on student technology use

Posted by Randy on 26th October 2009

Educause just released their 2009 study of undergrad technology use.  Here are some highlights that caught my attention:

  • Most have newer computers, mostly laptops
  • Cell phones almost universal — 66% have an internet-capable phone or will have one soon — 33% actively use the mobile internet
  • 60% prefer only moderate technology use in the classroom — only 45% think instructors use technology effectively

Personal use of web 2.0 technologies is pretty strong:

  • Social networking sites 95%/wiki editing 42%/blogging 37%/use podcasts 35%

Compare this to how actively these same technologies are used for instructional use:

  • Social networking sites 28%/wiki 25%/blogging 12%/use podcasts 6%.

It is not clear why the use is so much lower in instructional settings, but perhaps the low opinion of instructors’ effective use of technology has some bearing here.

The study also finds a sharp rise in communication via mobile texts and social networking sites.  Around this time of year I always wonder how the new class of students is communicating with each other electronically, and how that has changed over time.  With the increased use of mobile text messages, plus tools like Facebook and Twitter, how important is email?  We have had a student listserv mailing list since 2004.  Has use of this list changed over that time?  The short answer is no as far as total messages are concerned, but those messages are being sent by fewer people; details are below.  I have differentiated messages sent by staff/faculty from those which are student-to-student, because I wanted to see if perhaps student use dropped while staff/faculty use rose.

Total email messages sent to list by type

Month      Messages  Business  B%   Student  S%
Sept-2004  235       83        35%  152      65%
Sept-2005  200       89        45%  111      56%
Sept-2006  194       90        46%  104      54%
Sept-2007  192       65        34%  127      66%
Sept-2008  238       77        32%  161      68%
Sept-2009  200       58        29%  142      71%

Total author count by type

Month      Authors   Business  B%   Student  S%
Sept-2004  87        23        26%  64       74%
Sept-2005  85        20        24%  65       76%
Sept-2006  88        26        30%  62       70%
Sept-2007  71        21        30%  50       70%
Sept-2008  76        19        25%  57       75%
Sept-2009  68        19        28%  49       72%

Note that only 25–30% of our students actively email the list.  As the results show, email is still as popular for our student-to-student communication as it was 5 years ago, although fewer people are actively participating.  So maybe they are communicating more overall, but email is still a strong part of that.  Almost every student at the school is subscribed to the list.  My guess (supported by a quick look over the email subject lines) is that the list is being used for school business (lost and found, get-your-tickets-for, etc.) and not for classroom or socializing purposes.
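
One way to put a number on "same volume, fewer people" is to divide the student message count by the student author count for each September. The short sketch below does only that arithmetic, using the Student columns copied from the two tables above; the variable names are just for illustration.

```python
# Student-to-student messages and distinct student authors,
# copied from the Student columns of the two tables above.
student_messages = {2004: 152, 2005: 111, 2006: 104, 2007: 127, 2008: 161, 2009: 142}
student_authors = {2004: 64, 2005: 65, 2006: 62, 2007: 50, 2008: 57, 2009: 49}

# Average number of messages sent by each student who posted at all.
messages_per_author = {
    year: student_messages[year] / student_authors[year]
    for year in student_messages
}

for year in sorted(messages_per_author):
    print(f"Sept-{year}: {messages_per_author[year]:.2f} messages per student author")
```

On these figures the average goes from roughly 2.4 messages per student author in 2004 to about 2.9 in 2009, which fits the picture of a smaller group of students doing more of the posting.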

The ECAR Study of Undergraduate Students and Information Technology, 2009 | EDUCAUSE

Since 2004, the annual ECAR Study of Undergraduate Students and Information Technology has sought to shed light on how information technology affects the college experience. We ask students about the technology they own and how they use it in and out of their academic world… In addition to studying student ownership, experience, behaviors, preferences, and skills with respect to information technologies, the 2009 study also includes a special focus on student ownership and use of Internet-capable handheld devices.

Posted in Data, Learn, student, students | Comments Off

Hardening WordPress and scanning for past exploits

Posted by Randy on 26th October 2009

The WordPress Exploit Scanner plugin scans system files, posts/comments and plugins for suspicious content.  If you have a blog site that you think may have been compromised, it can’t hurt to give it a try.  I ran it on a couple of blogs I administer and I’m happy to report that everything looks fine.  One of the things the plugin looks for is hidden code in your site, especially hidden style elements. This is a way that spammers can insert code into your site, but there are lots of legitimate reasons for these elements too. As a result the report can look a little alarming or overwhelming at first, so run it when you have some time to read through the output.  A good tool to keep handy for when it is needed.
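
To give a feel for the kind of check involved, here is a minimal sketch in Python. It is not the plugin's code (the plugin is PHP and checks far more than this). It assumes that "hidden style elements" means inline styles such as display:none or visibility:hidden, and the function name find_hidden_style_snippets is made up for illustration.

```python
import re

# Inline style attributes that hide content from readers.
HIDDEN_STYLE = re.compile(
    r"""style\s*=\s*["'][^"']*(?:display\s*:\s*none|visibility\s*:\s*hidden)[^"']*["']""",
    re.IGNORECASE,
)


def find_hidden_style_snippets(html, context=40):
    """Return short excerpts of `html` around each hiding inline style."""
    hits = []
    for match in HIDDEN_STYLE.finditer(html):
        start = max(match.start() - context, 0)
        end = min(match.end() + context, len(html))
        hits.append(html[start:end])
    return hits


# Example with made-up post content.
post = '<p>Hello</p><div style="display:none"><a href="http://spam.example/">pills</a></div>'
for excerpt in find_hidden_style_snippets(post):
    print(excerpt)
```

As with the plugin's report, a hit is only a prompt to look closer: themes and widgets hide elements for perfectly ordinary reasons.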

WordPress Exploit Scanner

This plugin searches the files and database of your website for signs of suspicious activity. It will not stop someone hacking into your site, but it may help you find any uploaded or compromised files left by the hacker.

WordPress › Blog » WordPress 2.8.5: Hardening Release

We recommend that all sites are upgraded to this new version of WordPress to ensure that you have the best available protection…If you think your site may have been hit by one of the recent exploits and you would like to make sure that you have cleared out all traces of the exploit then we would recommend that you take a look at the WordPress Exploit Scanner.

Posted in Data, PLE, Technology, Web, WordPress, blog, database, plugin | Comments Off

OpenCalais semantic extraction service

Posted by Randy on 21st October 2009

What is OpenCalais, and what is this semantic web stuff all about?  Sure I watched the video on their site, and read through the documentation.  Somehow this all will make web content in general, and these blog posts in particular, easier to find and link with other relevant information.  Which all sounds good, but I want to see it in action.  So I installed their Tagaroo WordPress plug-in.

The most immediate change is the addition of the tagaroo tag area, which suggests tags based on the post content.  It is pretty cool too, as it dynamically updates and suggests new tags as you add content.  It also has a Flickr image suggestion bar, which isn’t working at the moment, but also doesn’t hold any interest for me; I don’t see how random Flickr image additions enhance this content.  I went back to a recently published post and added in all of the suggested semantic tags.  When looking at the page mark-up I don’t see any indication of tagaroo/opencalais’ presence.  Maybe it is posting information back to the OpenCalais servers?  I did need to register for an API key.  I’ll play with it a little more, but if it helps make this content more semantic, then why not?

How Does Calais Work? | OpenCalais

We want to make all the world’s content more accessible, interoperable and valuable. Some call it Web 2.0, Web 3.0, the Semantic Web or the Giant Global Graph – we call our piece of it Calais.

Oracle Database integrates OpenCalais | OpenCalais

Tight integration empowers Oracle Spatial 11g Release 2 users to deploy production-strength semantic solutions with unprecedented speed.

wordpress | OpenCalais

Tagaroo provides automated tag generation and image location for WordPress bloggers. We like Tagaroo so much that we gave him his own website. If you’re a WordPress blogger and would like to integrate Calais functionality directly within your blogging life then hop on over to Tagaroo.

Tagaroo » Make blogging better!

Tagaroo is designed to make your WordPress blog better for you, better for your readers and more accessible to search engines. As you’re writing, Tagaroo analyzes the text in your post and suggests intelligent tags for the things and events you’re writing about.

Posted in Data, Design, Technology, Web, Web 2.0, WordPress, access, blog, blogging, content, database | Comments Off

WordPress in Website Magazine’s top 10

Posted by Randy on 19th October 2009

Website Magazine’s listing of top sites for web pros has WordPress.com in the top ten, with Twitter only at number 12.  Not sure what it means, if anything, but it is interesting to see what their “proprietary method” thinks is important.

Top 50 Websites for ‘Net Professionals – Website Magazine – Website Magazine

Website Magazine’s Top 50 rankings are a measure of a website’s popularity. Ranks are calculated using a proprietary method that focuses on average daily unique visitors and page views over a specified period of time as reported by multiple data sources. The sites with the highest combination of factors are ranked in the first position.

Posted in Data, Lifestyle, PLE, Twitter, Web, WordPress, blog, content | Comments Off

Dramatic new Bates College Website — powered by WordPress

Posted by Randy on 7th October 2009

I have been following the Bates Online Media blog for about a year.  They have been blogging as they’ve worked through a pretty dramatic redesign of the college web site.  The fact that it is built on WordPress is probably the least important feature (but the WordPress geek in me does thrill just a bit at the news).  With my eldest now a freshman at Springfield College, we all spent a lot of time as consumers on college sites, and from the user perspective the Bates site is very smartly laid out and easy to use.  Nice to see such a great end result after a careful and productive planning process.

Bates College goes beyond the usual homepage redesign with Home 4 running on WordPress | collegewebeditor.com

We have been managing the site in WordPress since the beginning, first as a proof-of-concept with student assistants at WordPress.com, then as a working prototype with WP 2.8 software on an external hosting service, and now hosted on a campus Web server…The slideshows are handled with NextGenGallery, with the overlays via Thickbox and jquery. We’re using a number of plugins to enable shortcodes for editors along with WP-Table Reloaded for organization of tabular data originating in DabbleDB. We had over 30,000 views on launch day — about double the load of an average day — all served by WP-SuperCache.

Bates College

Bates Views is a site of thoughtful text, images, audio and video. Click a category below to expand

One Bates. Many Journeys. « Bates Online Media

I’m sharing the current draft of a vision paper on ways such an education might be expanded through online collaboration.

It’s 12-pages long, so here is a PDF version (2.1 MB download). It’s an evolving draft, so please send comments and suggestions for improvements.

Posted in Data, Design, PHP, PLE, Web, WordPress, audio, blog, blogging, campus, education, planning, plugin, plugins, student, video | Comments Off

Open Education: Talis Incubator Proposal

Posted by Joss Winn on 4th October 2009

Back in May, I woke up with an idea in my head which, in a slightly modified form, I’d now like to try and find funding for.[1] The idea is based on work we’re doing on our JISCPress project, which itself is based on work Tony and I have been doing with WriteToReply since February. In my original blog post, I proposed that WordPress Multi User[2] and Scriblio, a set of plugins for WordPress which allows you to import an OPAC library catalogue and benefit from all the advantages of the WordPress ecosystem, would together allow libraries to host independently branded catalogues on an open, union platform.

Imagine that JISC, Talis or Eduserv offered such a platform to UK university libraries. It could be a service, not unlike wordpress.com, where authorised institutions could self-register for a site and easily import their OPAC, apply a theme, tweak some CSS, choose from a few useful plugins, and within a day or two have a branded, cutting-edge search and browse interface to their OPAC, running under their own domain.

Paul and I gave a Lightning Talk about this at Mashoop North, which I present to you below.

Slide four is the useful one. It shows the various slices of the platform and, by implication, the various uses each layer offers.  The bottom slice shows the OPACs converging with WPMU to the benefit of the institution. It’s a nice, easy, hosted service that would offer an end-user experience not unlike the one that Plymouth State offer to their users. The middle slice – the WPMU bit – is where the OPACs converge together in union, under a single administrative interface that is easy to manage, widely used and supported. For $5000/year, Automattic, the company that leads the development of WordPress and runs wordpress.com, would provide support and advice with a six-hour SLA. On top of that, anyone with a knowledge of PHP can quickly learn the guts of WordPress, as Alex, who’s working on JISCPress, will testify. My point is that this is a well-tested and widely understood technology.

Now, once you have one or more OPACs hosted on WPMU, you bring together a lot of library catalogue data into one database and the platform’s web analytics (i.e. usage trends) can be a rich source of data for learning about what library users are looking for. Each library would have access to its own analytics, while the analytics for the entire platform would also be collected. I do this on our university WPMU installation.

The next slice in our diagram shows a few different ways of getting data out of the platform (and this would also apply to each individual catalogue site, too).  First, you can see that the platform as a whole could act as a union catalogue where, from a single site, users could search across library holdings. That union catalogue would have all the useful features of WordPress, too. Next to that, you can see Triplify, a nice little web application that transforms a relational database into RDF/N3, JSON and Linked Data. Triplify could re-present the data in each catalogue as semantic data and this could be subsequently hosted on the Talis platform.  We’re already doing this with JISCPress. Every night, changes to any of the library catalogue data could be pushed to Talis, where the data can be queried and mashed up using the Talis API. Finally, don’t forget good old RSS and Atom feeds, which are available for almost every WordPress endpoint URL, as I’ve previously documented.
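
The feed endpoints mentioned above follow predictable URL patterns, which is what makes them easy to consume from other tools. Here is a minimal sketch, assuming a WordPress site with pretty permalinks enabled and the default category and tag bases. The host name and the feed_urls helper are made up for illustration.

```python
from urllib.parse import urlencode, urljoin


def feed_urls(site, category=None, tag=None, search=None):
    """Build common WordPress feed URLs for `site`.

    Assumes pretty permalinks and the default 'category' and 'tag' bases.
    """
    base = site.rstrip("/") + "/"
    urls = {
        "site_rss2": urljoin(base, "feed/"),
        "site_atom": urljoin(base, "feed/atom/"),
        "comments": urljoin(base, "comments/feed/"),
    }
    if category:
        urls["category"] = urljoin(base, f"category/{category}/feed/")
    if tag:
        urls["tag"] = urljoin(base, f"tag/{tag}/feed/")
    if search:
        # Search results are exposed as a feed via the query string.
        urls["search"] = base + "?" + urlencode({"s": search, "feed": "rss2"})
    return urls


# Hypothetical catalogue site, for illustration only.
for name, url in feed_urls("http://catalogue.example.ac.uk", tag="film", search="jean luc").items():
    print(name, url)
```

On a hosted catalogue, the same patterns would apply to each library's site, so a subject tag or a saved search could be followed as a feed without any extra development.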

Given the work we’ve done on JISCPress, which covers our experience with WPMU and Triplify, I think that a demonstrator prototype, using entirely open source software, could be developed within the constraints of the Talis Incubator fund. I canvassed my original idea to the Scriblio mailing list and had positive and useful feedback from Ross Singer at Talis. Leigh Dodds at Talis also sees potential in the use of WPMU and Triplify. I understand that neither of them is endorsing the idea for the Talis Incubator fund, but their interest has been encouraging.

So, what I’m proposing is that Paul and I work with Casey Bisson, the Scriblio developer, on a short project to get this all up and running. In my mind, Scriblio needs some more work to make the set up process easier for a variety of library catalogues and the last time I looked, it needed documenting better, too. I think that the maximum of £15,000 from the Talis fund is workable. In fact, I’d like to bring it down a little to make it more attractive to the judges. Paul would bring his knowledge and expertise from working with our university library catalogue, I would bring what we’ve learned from JISCPress and could manage the WPMU server side of things and the project in general, as well as write documentation, while Casey could be funded to spend some dedicated time fine tuning Scriblio to meet our objectives.

So what do you think? A wordpress.com-like platform for library OPACs that pushes semantic data to the Talis platform. Each catalogue remains under the control of its owner institution, while contributing to a wider union OPAC that will benefit users and offer the library community some useful analytics. The platform, as a technology, would be as flexible as WordPress itself, so additional features could be developed for the platform by other future projects. Only last week, Tony was discussing on his new Arcadia project blog how it would be useful to be able to capture library catalogue links as QR codes. Well, using WordPress in the way I’ve described, we could implement that across every UK HEI library catalogue in a snap using this plugin. Hoorah!

  1. I figure that if I repeat this idea enough times, someone will see that it’s worth funding ;-)
  2. and here I’ll repeat what is becoming my mantra: ‘the same software that runs six million blogs on wordpress.com’

Posted in Data, Fun, Funding, Libraries, Mashups, Open Education, Projects, Scriblio, Triplify, Web, commons, talis, wpmu, wpmudev | Comments Off

Mashoop North!

Posted by Joss Winn on 7th July 2009

Paul and I have just presented our ‘lightning talk’ on the use of WordPress MU and Scriblio to create a platform for publishing multiple OPAC catalogues and then exposing the aggregate data as RDF using Triplify. I blogged about this idea a while back and this is the first presentation we’ve given. Not sure what people made of it. Too ambitious? Threatening? Confusing? All I know is that from where I’m standing, it would require a relatively small amount of funding to show it working in principle with a handful of library catalogues. The difficult part would be scaling it to work for 100+ catalogues (though bear in mind, wordpress.com hosts 6 million sites) and satisfying the politics of each institution. Still, that shouldn’t stop us from trying.

Posted in Data, Fun, Libraries, Mashups, OPAC, Projects, Triplify, WordPressMU, library, library2.0, mashlib09, semanticweb, wpmudev | Comments Off

Scriblio, Triplify and XMPP PubSub

Posted by Joss Winn on 17th May 2009

It occurred to me this morning, as I woke from my slumber, that the work I’ve been doing recently with WordPress could also be applied to a library catalogue using Scriblio.

Scriblio (formerly WPopac) is an award winning, free, open source CMS and OPAC with faceted searching and browsing features based on WordPress. Scriblio is a project of Plymouth State University, supported in part by the Andrew W. Mellon Foundation.

Which means that you can import your library catalogue into WordPress and the user can search for and retrieve a record for The Films of Jean-Luc Godard. Have a look around Plymouth State’s Scriblio and you’ll get a good feel for what’s possible.

Anyway, taking Scriblio’s functionality for granted, you could easily add Triplify to the mix as I have discussed before. So with very little effort, you can convert your library catalogue to RDF N-Triples (and/or JSON). My question to you librarians is: knowing this is possible and fairly trivial to do, is there any value to you in exposing your OPACs in this way?
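
For anyone who hasn’t seen N-Triples, the sketch below shows roughly what one database row looks like once it has been turned into triples. It is not how Triplify works internally (Triplify is a PHP script driven by SQL queries). The URI layout, the vocabulary path, the function names and the sample record are all assumptions made up for illustration.

```python
def escape_literal(value):
    """Escape a value for use inside a double-quoted N-Triples literal."""
    return (
        str(value)
        .replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def row_to_ntriples(base_uri, table, key, row):
    """Turn one row (a dict of column -> value) into N-Triples lines.

    The row's key column becomes part of the subject URI, and every other
    non-empty column becomes one triple with a plain literal object.
    """
    subject = f"<{base_uri}/{table}/{row[key]}>"
    lines = []
    for column, value in row.items():
        if column == key or value is None:
            continue
        predicate = f"<{base_uri}/vocabulary/{column}>"
        lines.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return lines


# A made-up catalogue record on a hypothetical host.
record = {"id": 42, "title": "The Films of Jean-Luc Godard", "format": "Book"}
for line in row_to_ntriples("http://opac.example.ac.uk", "record", "id", record):
    print(line)
```

Each printed line is one subject, predicate, object statement ending in a full stop, which is all an N-Triples file is. A real export would also want typed literals and links between records, not just plain strings.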

Next, as I lay listening to my daughter chat to her squeaky duck, I thought about the other stuff I’ve been looking at recently with WordPress.  Once you think of your library catalogue as a WordPress site, there’s quite a lot of fun to be had.  You could ramp up the feeds that you offer from your OPAC, use the OpenCalais API to add semantic tags, plug in some more semantic add-ons if you wish (autodiscovery of SIOC, FOAF, OAI-ORE data??), and, perhaps most fun of all, publish OPAC records in realtime over XMPP PubSub.

Which brings me to JISCPress, our recent #jiscri project proposal, which we may or may not get funded (what are we, a week or two away from finding out??).  In that project, we’re proposing a WordPress MU platform for publishing and discussing JISC funding calls and project reports (among other things).  There’s a lot of cross-over between the above Scriblio ideas and JISCPress. So much so that it’s probably no more than a day’s work to transform the JISCPress platform, hosted as an Amazon Machine Image, into a multi-user OPAC platform where, potentially, all UK university libraries publish their OPACs via separate Scriblio sites.

You could then, like wordpress.com has done, publish an XMPP firehose from every catalogue over PubSub for search engines or whoever is interested in realtime data from UK university library catalogues. Alternatively, instead of the WPMU set up, each university library could maintain its own Scriblio install and publish an XMPP feed to an agreed server (though that approach seems like more hassle than is necessary if you ask me. You’re bound to have some libraries falling behind and not upgrading their sites as things develop. For less than a collective £4K/year, we could all buy into commercial support for a WPMU site from Automattic to help maintain server-side stuff).

I dunno. Maybe this is all off the wall, but the building blocks are all there. Is anyone experimenting with Scriblio in this way? Don’t tell me, a bunch of you have been doing it for years…

Posted in API, Andrew W. Mellon Foundation, Data, Fun, GBP, Jean-Luc Goddard, Libraries, Mashups, OPAC, Plymouth State University, Scriblio, Standards & Specs, Triplify, UK University, United Kingdom, University library, Web, WordPress, catalogue, jiscri, library, pubsub, rdf, search engines, semanticweb, wpmu, wpmudev, xmpp | Comments Off

Getting your Triples into Talis Connected Commons

Posted by Joss Winn on 29th April 2009

A few days ago, I wrote about adding Triplify to your web application. Specifically, I wrote about adding it to WordPress, but the same information can be applied to most web publishing platforms. Earlier this month, TALIS announced their Connected Commons platform and yesterday they announced a commercial version of their platform for the structured storage of Linked Data. Storage is all very well, but more importantly they have an API for developers, so that the data can be queried and creatively re-used or mashed up.

So this got me thinking about JISCPress, our recent JISC Rapid Innovation Programme bid, which proposes a WordPress Multi-User based platform for publishing JISC funding calls and the reports of funded projects. This is based on my experience of running WriteToReply with Tony Hirst.

Although it is a service for comment and discussion around documents, one of the things that interests me most about WriteToReply and, consequently, the JISCPress proposal is the cumulative storage of data on the platform and how that data might be used. No surprise really, as my background is in archiving and collections management. As with the University of Lincoln blogs, WriteToReply and the proposed JISCPress platform aggregate published content into a site-wide ‘tags’ site that allows anyone to search and browse through all content that has been published to the public. In the case of the university blogs, that’s a large percentage of blogs, but for WriteToReply and JISCPress, it would be pretty much every document hosted on the platform.

You can see from the WriteToReply tags site that over time, a rich store of public documents could be created for querying and re-use. The site design is a bit clunky right now but under the hood you’ll notice that you can search across the text of every document, browse by document type and by tag. The tags are created by publishing the content to OpenCalais, which returns a whole bunch of semantic keywords for each document section. You’ll also notice that an RSS feed is available for any search query, any category and any tag or combination of tags.

Last night, I was thinking about the WriteToReply site architecture (note that when I mention WriteToReply, it almost certainly applies to JISCPress, too – same technology, similar principles, different content). Currently, we categorise each document by document type, so you’ll see ‘Consultations’, ‘Action Plans’, ‘Discussion Papers’, etc. We author all documents under the WriteToReply username, too, and tag each document section both manually and via OpenCalais. However, there’s more that we could do, with little effort, to mark up the documents and I’ve started sketching it out.

You’ll see from the diagram that I’m thinking we should introduce location and subject categories. There will be formal classification schemes we could use. For example, I found a Local Government Classification Scheme, which provides some high level subjects that are the type of thing I’m thinking about. I’m not suggesting we start ‘cataloguing’ the documents, but simply borrow, at the top level, from recognised classification schemes that are used elsewhere. I’m also thinking that we should start creating a new author for each document and in the case of WriteToReply, the author would be the agency who issued the consultation, report, or whatever.

So following these changes, we would capture the following data (in bold), for example:

The Home Office created Protecting the public in a changing communications environment on April 27th which is a consultation document for England, Wales and Scotland, categorised under Information and communication technology with 18 sections.

Section one is tagged Governor, Home Department, Office of Public Sector Information, Secretary of State, Surrey.

Section two is tagged communications data, communications industry, emergency services, Home Secretary, Jacqui Smith MP, Rt Hon Jacqui Smith MP.

Section three is tagged Broadband, BT, communications, communications changes, communications data, communications data capability, communications data limits, communications environment, communications event, communications industry, communications networks, communications providers, communications service providers, communications services, emergency services, Her Majesty’s Revenue and Customs, Home Office, intelligence agencies, internet browsing, Internet Protocol, Internet Service, IP, mobile telephone system, physical networks, public telecommunications service, registered owner, Serious Organised Crime Agency, social networking, specified communications data, The communications industry, United Kingdom.

Section four is tagged …(you get the picture)

Section five, paragraph six, has the comment “fully compatible with the ECHR” is, of course, an assertion made by the government, about its own legislation. Has that assertion ever been tested in a court? authored by Owen Blacker on April 28th 11:32pm.

Selected text from Section five, paragraph eight, has the comment Over my dead body! authored by Mr Angry on April 28th 9:32pm

Note that every author, document, section, paragraph, text selection, category, tag, comment and comment author has a URI, Atom, RSS and RDF end point (actually, text selection and comment author feeds are forthcoming features).
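
The example above can also be read as a small data model. The sketch below is only an illustration of that structure, filled in with part of the Home Office example. The class and field names are assumptions and do not come from WordPress or WriteToReply code.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Comment:
    author: str
    posted: str
    text: str
    paragraph: int


@dataclass
class Section:
    number: int
    tags: List[str] = field(default_factory=list)
    comments: List[Comment] = field(default_factory=list)


@dataclass
class Document:
    author: str          # the issuing agency, not a shared site username
    title: str
    published: str
    doc_type: str        # e.g. consultation, action plan, discussion paper
    locations: List[str]
    subject: str
    sections: List[Section] = field(default_factory=list)


def comment_authors(document):
    """List everyone who has commented on any section of the document."""
    return [c.author for s in document.sections for c in s.comments]


consultation = Document(
    author="The Home Office",
    title="Protecting the public in a changing communications environment",
    published="April 27th",
    doc_type="consultation",
    locations=["England", "Wales", "Scotland"],
    subject="Information and communication technology",
    sections=[
        Section(1, tags=["Governor", "Home Department",
                         "Office of Public Sector Information",
                         "Secretary of State", "Surrey"]),
        Section(5, comments=[
            Comment("Owen Blacker", "April 28th 11:32pm",
                    "Has that assertion ever been tested in a court?", 6),
            Comment("Mr Angry", "April 28th 9:32pm", "Over my dead body!", 8),
        ]),
    ],
)
print(comment_authors(consultation))
```

Each of these objects corresponds to something that already has its own URI and feeds on the site, which is what makes the structure a natural candidate for re-publishing as RDF.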

Now, with this basic architecture mapped out, we might wonder what Triplify could add to this. I’ve already shown in my earlier post that, with little effort, it re-publishes data from a relational database as N-Triples semantic data, so everything you see above, could be published as RDF data (and JSON, too).

So, in my simple view of the world, we have a data source that requires very little effort to generate content for and manage (JISCPress/WriteToReply/WordPress), a method of automatically publishing the data for the semantic web (Triplify) and, with TALIS, an API for data storage, data access, query, and augmentation.  As always, my mantra is ‘I am not a developer’, but from where I’m standing, this high-level ‘workflow’ seems reasonable.

The benefits for the JISC community would primarily be felt by using the JISCPress website, in a similar way (albeit with better, more informed design) to the WriteToReply ‘tags’ site. We could search across the full text of funding calls, browse the reports by author, categories and tags and grab news feeds from favourite authors, searches, tags or categories. This is all in addition to the comment, feedback and discussion features we’ve proposed, too. Further benefits would be had from ‘re-publishing’ the site content as semantic data to a platform such as TALIS. Not only could there be further Rapid Innovation projects which worked on this data, but it would be available for any member of the public to query and re-use, too. No longer would our final project reports, often the distillation of our research, sit idle as PDF files on institutional websites and in institutional repositories. If the documentation we produce is worth anything, then it’s worth re-publishing openly as semantic data.

Finally, in order to benefit from the (free) use of TALIS Connected Commons, the data being published needs to be licensed under a public domain or Creative Commons ‘zero’ licence. I suspect Crown Copyright is not compatible with either of these licences, although why the hell public consultation documents couldn’t be licensed this way, I don’t know. Do you? For JISCPress, this would be a choice JISC could make. The alternative is to use the commercial TALIS platform or something similar.

As usual, tell me what you think… Thanks.

Posted in API, Data, Funding, JISC Rapid Innovation Programme, JISCPress, Mashups, Open Access, Projects, Standards & Specs, University of Lincoln, Web, commons, communication technology, jiscri, rdf, relational database, semantic web, talis, web application, web publishing, web publishing platforms, wpmudev | Comments Off