Archive for May 15, 2006

Classes are over. Let’s get to work!

Classes at UHS ended yesterday. Now the work of our department really heats up. Like the construction crews on campus, we have just under three months to take everything apart, order and configure the new bits, and put it all back together again in time for the start of school. Of course, I will also hand the reins off to a new director partway through this race!

For some reason, it is a well-kept secret how much work we do during the summer. While most of the school takes its vacation, we take advantage of the departure of most of the users to upgrade facilities and perform routine maintenance on systems and servers. That way, we start the year with a configuration built to last and the ability to give our users the attention they need during the year. Schools that employ their tech staff on a school-year schedule should consider extending them to year-round employment. The additional cost is well worth it!

Today, we removed just about every bit of computer equipment from the lower campus in anticipation of the start of construction next week. The PC Lab, Mac Lab, physics classroom computers, arts classroom computers, and office printers are all now upstairs in our staging area near our offices. Our students were helpful in transporting pieces up to the Library. Next year, the lower campus will have an elevator and ADA ramps, which will be so helpful!

Yahoo! completes its transformation

Yahoo! recently released a new home page design, practically eliminating the feature that got them started. Yahoo! started as a human-built web directory. You could submit your site to a category, and someone might review it and add it to the collection. Yahoo! Picks was a cool offshoot of this feature, marking one cool new site per day. (You can now subscribe to Picks via RSS!) InfoSeek challenged Yahoo!’s model by writing algorithms to crawl the web and build a searchable index based on key words in web pages. For a while, the two methods remained on equal footing, until Google came along and based search relevance on the number of links to a site. At the same time, Yahoo! vastly broadened their services by acquiring or developing dozens of sometimes unrelated features, for example mail, weather, games, videos, music, home pages, and groups.

You can track the descent into obscurity of the Yahoo! Directory with each revision of the Yahoo! home page. I have circled the directory portion in each version.

1996

1999

2002

2004

2006

Current

How do you like the new design?

I’m not a big fan of the new Yahoo! design. It is challenging to create a portal home page that caters to such a wide audience and yet remains simple and easy to use. Yahoo! duplicates information in different places. Sports, for example, appears in both the left-hand menu and the “Featured” block. In the previous version, Sports was a subcategory of News. Now it’s a subcategory of Featured. Why? Mail appears both in a button on the left and in the nifty AJAX popup on the right. Overall, the page is crazy busy in order to fit all of the commercial and service functions in one page. AJAX layers are cool, but hiding content within the page just adds to its complexity and ability to confuse. I also preferred the darker borders of the previous versions. The light blue border color disappears, making the blocks all run together.

On the other hand, one rule of portal design is that people will get used to the locations of items and return to them repeatedly as long as their position doesn’t change too often. What is confusing at first glance may become familiar if you revisit it often enough. A quick scan of the Yahoo! home page history suggests that they change their design about every two years. This is the most significant rearrangement of the items ever. Previous changes were more about adding new blocks than moving or eliminating them.

What do you think?

What Is “Most Popular?”

The right-hand column of this blog notes the five most popular posts. It uses the MostViewed Nucleus plug-in, which displays the items that have received the most hits. This method favors old posts, since the longer they are on the site, the more hits they get (mostly from Google). MostViewed uses the following SQL query to find the top five most popular items:

"SELECT i.inumber id, v.views views, i.ititle title ".
"FROM ".sql_table('plugin_views')." v, ".sql_table('item')." i ".
"WHERE v.id = i.inumber ".
"ORDER BY views DESC ".
"LIMIT 0, ".intval($numOfPostsToShow);

I wrote a new query to take the item’s age into account. This produces the top posts by calculating the rate of hits over time.

SELECT i.inumber id, v.views views, i.ititle title,
((UNIX_TIMESTAMP(NOW()) - UNIX_TIMESTAMP(i.itime))/v.views) AS pop
FROM nucleus_plugin_views v, nucleus_item i
WHERE v.id = i.inumber AND i.itime > '1970-01-01'
ORDER BY pop;

(Because of rounding, this calculates the ratio of the number of seconds the post has been online to the number of hits, rather than hits per second, and then sorts ascending. The 1970 limit keeps drafts out of the results, as draft posts have a time of 0, which MySQL stores as 1969-12-31.)

This produced the opposite effect, favoring recent posts. This is because old posts are found primarily through Google search results, whereas new posts are found through RSS subscriptions and home page views in addition to web searches. The rate of hits tapers off once the item has disappeared from these two sources.
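To make the effect concrete, here is a minimal sketch in Python of the same seconds-per-hit ranking, run outside the database. The function names and the two sample posts are made up for illustration and are not part of MostViewed or Nucleus.

```python
from datetime import datetime

def seconds_per_hit(posted, views, now):
    """Seconds online per recorded hit; a smaller value means a more popular post."""
    return (now - posted).total_seconds() / views

def rank_posts(posts, now, limit=5):
    """Sort posts ascending by seconds-per-hit, skipping posts with no views."""
    scored = [
        (seconds_per_hit(p["posted"], p["views"], now), p["title"])
        for p in posts
        if p["views"] > 0
    ]
    return [title for _, title in sorted(scored)[:limit]]

# Hypothetical sample data: an old post with many total hits
# and a ten-day-old post with far fewer.
now = datetime(2006, 5, 15)
posts = [
    {"title": "old post", "posted": datetime(2005, 5, 15), "views": 3650},
    {"title": "new post", "posted": datetime(2006, 5, 5), "views": 200},
]
ranking = rank_posts(posts, now)
print(ranking)
```

In this made-up case the old post averages one hit every 8,640 seconds and the new post one every 4,320 seconds, so the new post ranks first despite having far fewer total hits.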

Another plugin, MostPopular, determines popularity by the number of comments rather than the number of page views. I don’t prefer this approach, because most readers are lurkers, and it doesn’t make a lot of sense to exclude most readers from a measure of popularity.

Periodically resetting the view counts appears to produce the best results, giving newer items a fair shot to rise to the top, but it requires manual intervention from time to time. Since the Views plugin stores hit totals, not individual hits, it’s not possible to count hits only since a specific date, or to exclude recent hits that are a result of post prominence.

If I really want to capture a more accurate measure of popularity (a questionable endeavor at best), then I should modify the Views plugin to log individual hits by date and then plot a frequency distribution of hits over time. I expect this would produce a bubble in the first couple of weeks of a post’s existence when the item is within the RSS feed and home page, and then taper off to a baseline popularity level based on search engine and link hits. Once the hits are stored by date, it would be possible to measure and correct for the bubble effect or implement a cutoff date in order to capture just the baseline popularity rate.
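A rough sketch of what that analysis might look like, assuming each hit is logged with its timestamp. The function names, the 14-day bubble window, and the sample hits are all assumptions for illustration; the Views plugin does not store data this way today.

```python
from collections import Counter
from datetime import datetime, timedelta

def hits_per_week(posted, hit_times):
    """Frequency distribution: number of hits in each week since posting."""
    return Counter((t - posted).days // 7 for t in hit_times)

def baseline_rate(posted, hit_times, now, bubble_days=14):
    """Hits per day after the initial bubble, or None if the post is too young."""
    cutoff = posted + timedelta(days=bubble_days)
    if now <= cutoff:
        return None
    late_hits = sum(1 for t in hit_times if cutoff <= t <= now)
    days_after_bubble = (now - cutoff).total_seconds() / 86400
    return late_hits / days_after_bubble

# Hypothetical hit log: 40 hits in the first two days, then one every other day.
posted = datetime(2006, 4, 1)
now = datetime(2006, 5, 15)
cutoff = posted + timedelta(days=14)
hit_times = [posted + timedelta(hours=i) for i in range(40)]
hit_times += [cutoff + timedelta(days=2 * i) for i in range(15)]

weekly = hits_per_week(posted, hit_times)
rate = baseline_rate(posted, hit_times, now)
print(weekly[0], rate)
```

A fixed cutoff is the simplest correction. Fitting a curve to the weekly counts would be another way to separate the bubble from the baseline.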

I didn’t know that MySQL had so many functions for performing calculations! That’s one unexpected benefit of this investigation.

Import Address Lists Into Google Earth

Google Earth offers an address import feature as part of its “pro” application, which is currently only available for Windows. However, there are a number of tools out there that will perform this conversion for free and produce a KML file, which you can import into Google Earth or other mapping applications. One such tool is BatchGeoCode. It is completely web-based, uses an easy Excel template (provided), and gives you the option to download the finished KML file. The only restriction is a limit of 100 addresses per conversion.
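For readers curious about what such a file contains, here is a minimal sketch in Python that writes placemarks to KML. It assumes the addresses have already been geocoded. The place names and coordinates are made up, and this is not how BatchGeoCode itself works.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"

def build_kml(places):
    """Return a KML document string with one Placemark per (name, lat, lon)."""
    ET.register_namespace("", KML_NS)  # emit KML as the default namespace
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for name, lat, lon in places:
        placemark = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML orders coordinates as longitude,latitude,altitude.
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat},0"
    return ET.tostring(kml, encoding="unicode")

# Hypothetical, already-geocoded sample points.
places = [
    ("Sample address A", 37.7749, -122.4194),
    ("Sample address B", 45.5152, -122.6784),
]
kml_text = build_kml(places)
print(kml_text)
```

Note the longitude-first coordinate order, which is easy to get backwards when pasting latitude and longitude pairs from a spreadsheet.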

For fun, I mapped the addresses of the University High School and Catlin Gabel faculty and saved the images from the same altitude (29 miles).

UHS
University High School: large image

CGS
Catlin Gabel School: large image

DokuWiki Knows Environment Variables!

I have written before about the practice of using environment variables to identify authenticated users in our school intranet. Now, finally, I have come across an open-source package that uses this method, too! I have installed DokuWiki on our internal server for a collaborative project our diversity club students are starting on San Francisco neighborhoods. This will replace the UseModWiki that we have had in place for three years.

DokuWiki automatically picks up the $_SERVER['REMOTE_USER'] environment variable, which is set on our IIS server for protected directories. Right out of the box, we get a wiki script that can stay private within our network and automatically identify the authors of all of the wiki edits.

Two small edits were immediately helpful. The first was to remove the domain from the userid, so that it would be shorter and easier to read. The second was to insert a database query against Blackbaud to show the real name of the currently logged in user. The second edit may not ultimately prove that useful, as the userid seems to be used throughout the script and the real name only sparingly.
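The logic of those two edits can be sketched as follows. This is Python rather than the PHP that DokuWiki uses, and it assumes the server reports the userid as DOMAIN\user (or user@domain). The function names are hypothetical, and a plain dictionary stands in for the Blackbaud query.

```python
def short_userid(remote_user):
    """Drop a leading Windows domain or a trailing @domain from a userid."""
    name = remote_user.rsplit("\\", 1)[-1]  # "SCHOOL\jdoe" -> "jdoe"
    return name.split("@", 1)[0]            # "jdoe@school.example" -> "jdoe"

def display_name(remote_user, directory):
    """Look up a real name, falling back to the short userid if none is found."""
    userid = short_userid(remote_user)
    return directory.get(userid, userid)

# Hypothetical stand-in for the database lookup of real names.
directory = {"jdoe": "Jane Doe"}

print(short_userid("SCHOOL\\jdoe"))
print(display_name("SCHOOL\\jdoe", directory))
print(display_name("SCHOOL\\asmith", directory))
```

Falling back to the short userid keeps the page readable when the lookup finds no match.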

Interestingly, DokuWiki tracks the current user by setting the REMOTE_USER environment variable if the server folder does not require authentication and DokuWiki is set to use internal authentication instead. For example, if you turn access control on and then log in, DokuWiki sets the environment variable to the new userid. Most programs use a cookie for this purpose.

I was hoping to give everyone editing privileges to the entire wiki, but it is likely that the diversity club will want to maintain editorial control over the content that it is currently working hard to assemble. Toward this end, I have enabled access control lists, but this stops the script from using REMOTE_USER by default, in order to require login first. Too bad! If I want to use access control lists, I will have to modify this script in the same fashion I have modified the others, pre-empting the login form by capturing REMOTE_USER and creating the logged-in state before the script checks for that.

Look for references to REMOTE_USER in /inc/common.php to find examples of its use in DokuWiki. This is where DokuWiki first determines user identity and invokes an authentication process if necessary.

Emilda, another open-source library system

Thanks to a recent comment, we have become aware of Emilda, another open-source library software system. This one appears to be further along in terms of features but has a smaller development community. It also is not mature on Windows IIS, so it appears that we will have to install a Linux server if we want to try either Koha or Emilda, or both.

Virtual Earth vs. Google Maps

Now I really need a MacBook Pro with Windows and Parallels!

real estate map

URL: http://www.johnlscott.com/SearchInteractive.aspx

This is the greatest real estate searching tool, but I bet it would run on my Mac were it built on Google Maps instead of Microsoft Virtual Earth!

Documenting Our Practice

We don’t do a great job of keeping our documentation up to date. Now with me leaving and Cécile coming, it makes a lot of sense to document as much as possible of what we do, so that the transition is as smooth as possible. Though it may not be the conventional wisdom, we just don’t have the time to produce such complete documentation the rest of the time and, more importantly, keep it up to date!

Plone never caught on as a general-purpose tool at UHS, because the graphic template was too rigid, Zope/Python were too unfamiliar to us, and the Active Directory integration was incomplete in the free version (it doesn’t support groups well enough). Plone is still perfect for internal, tech department documentation, and I still feel it is kind of like a wiki on steroids. When I finish building our new web server, I will install Plone and bind it to Active Directory, but I won’t make it available to most of our school community.

Here is my tech documentation Plone folder so far. I have a long way to go, but I am just trying to add one important topic per day as issues in that area arise.

plone

MacBooks Well-Timed

Apple’s announcement of the MacBook is well-timed for us. Our seven language teachers are due to have their iBook G3s replaced this summer, according to our three-year laptop replacement cycle. We only bought three iBook G4s last summer to accommodate new faculty members. Last time, the language teachers received machines at the end of the cycle. This time, they will be on the leading edge! Jim Heynderickx (hey, that’s the first time I spelled it correctly from memory!) now has a dilemma on his hands. We are in a safer position, only risking the purchase of seven new machines on this first-generation hardware. I wonder what proportion of Catlin Gabel ninth grade students will buy Apple?

Apple claims that the new machines are five times as fast as the iBook G4. Of course, we remember Apple’s checkered history of performance promises, don’t we!

I like that the MacBook is available in black. I will probably still buy white.

If the language teachers desire, we should be able to install Windows on these Macs as well. Our Microsoft education license agreement already requires us to count Macs in our total that we submit each year.

New Voting Script

vote screenshot

I rewrote our voting script to make use of programming skills I have learned in the last two years. The script now uses MySQL instead of a flat text file, builds the input form dynamically instead of using a static HTML file, and provides live results to admins. I also provided a way to select dynamically among different voting methods.

The script currently supports two: plurality and approval. Plurality is conventional voting: each voter gets one vote per office. Approval voting permits the voter to cast as many votes as desired. I may build other, more complex voting schemes in the future, especially to provide instant-runoff capability. For example, preference voting permits voters to rank candidates and also perhaps distribute points to different candidates.
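The difference between the two supported methods fits in a few lines. This is a Python sketch of the counting rules only, not the actual script. The ballot format, the candidate names, and the choice to discard plurality ballots that do not name exactly one candidate are all assumptions.

```python
from collections import Counter

def tally(ballots, method):
    """Count votes for one office. Each ballot is a list of candidate names."""
    if method not in ("plurality", "approval"):
        raise ValueError(f"unsupported voting method: {method}")
    counts = Counter()
    for ballot in ballots:
        choices = set(ballot)  # a name repeated on one ballot counts once
        if method == "plurality" and len(choices) != 1:
            continue  # plurality allows exactly one choice per office
        counts.update(choices)
    return counts

# Hypothetical ballots for a single office.
ballots = [["Ana"], ["Ben"], ["Ana", "Ben"], ["Ben", "Cy"]]

plurality = tally(ballots, "plurality")
approval = tally(ballots, "approval")
print(plurality.most_common())
print(approval.most_common())
```

With these made-up ballots, plurality counting leaves Ana and Ben tied at one vote each, while approval counting gives Ben three votes, Ana two, and Cy one.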

As usual, I did not build forms for election setup, since it is so easy for the tech admin to use a SQL admin tool to set up the election. It is a low priority for me now, but it will become more important once I have moved on.