Takeaways and Some future plans
From CS219
Dear All (please forward to Sasank--don't have his email here just now, same for Jeff),
As I think back on our conversations yesterday, I draw a few lessons:
1. It is crucial that we record the intentions of the user. I did not realize there were leaders, that there were plans, etc. As Deborah suggested, we want to record that on the video, or at least on one video. Clips that seemed unfocussed to me now appear deliberate and perhaps more interesting. Also, having the user's observations (not only intentions) should be helpful. And those observations won't usually make the ambient sound less clear.
2. For a class in Urban Sensing, it would be helpful for students to read some of the scholarly work on cities and urban life. For example, the book City by Whyte is terrific on visual documentation and what you can see. Perhaps a collaborator on projects from urban planning or sociology or... {or do you have these already??}. When people say the scene is "uninteresting" and it is not clear what to focus on, I take that to mean this is not a situation they have learned to see. [Much of what I have seen on computer science and cities, say using GPS etc., would be much more powerful if it drew on what we already know about cities.] This has implications for our next set of experiments and demonstrations, hopefully this summer with CENS summer students (Deborah, do you see this as possible?). I need to work with them, not for five minutes, but perhaps for an hour, to start them thinking about what they might see. On the other hand, people have different interests, curiosities, and fetishes/obsessions. So their untutored explorations might be quite interesting. Some of my work is of this sort (plants growing between the cracks in the concrete on ordinary streets).
3. Enthusiasm matters--and I am grateful for yours. The Roof group reminds me of when I was at the Port at the top of a crane that moves containers. Some questions for the "users": Was it easy to use the phones as now equipped with vcaps, with few glitches (you are sophisticated users, but most won't know a thing about Symbian...)? Was it hard to hold the phones steady? With ordinary cameras you have your head providing some support; here you are cantilevered out on your arms. Were you satisfied with the posted clips?
Ramesh and Moo Ryong and I met later in the day yesterday. Eventually we need to develop a much more flexible display, think about diffusing the technology (including the notion of "video albums") so that people might establish their own catnips (servers) and systems easily, deal with power conservation on the phones, develop better ways of dealing with poor WiFi connections (since packet internet from the providers is in practice still perhaps 1/10th the speed of WiFi, although more reliable?), think through the bandwidth/quality tradeoff and see if compression schemes might help..., and be on the lookout for new smartphones that are as capable as and less expensive than the N95 (whose major advantage, besides its until now unique capabilities, is its widespread popularity, and so lots of hacks and lots of firmware revisions).
In the shorter run, we want to be able to tag the video clips (to say which project is being pursued) on the phone; see if we can fix the sync of the sound (slightly off, at least for the intermediate quality videos); develop a much better training regime for the users (my main agenda); think through non-tripod ways of holding the N95, e.g. at the end of a string, for stability; improve the sound (e.g. use the little microphone on the extension for the earphones) without too much extra equipment. There are a number of other such "little" improvements.
In the medium term, this summer, we need to do some convincing demos of the system, where I have figured out how to train people to be more effective in their video-ing. A busy market, traffic, people walking on the street, a complex place or street, a laboratory building... --standard topics in urban planning. See the Whyte book. We did USC's Commencement, and if you go to catnip.usc.edu you can see maybe 100 video clips. Perhaps you want to do UCLA's Commencement, but now with many more cellphones and much wider coverage, including graduates and other marchers presenting their viewpoint. Interviews, as you can see, are very helpful. (I would be glad to come by to convey the lessons we learned from our attempt at USC last Friday.) Wide views lose detail at QVGA resolution, and even in VGA are not so good.
I would be grateful for other suggestions. Have I left out something important?
Now for the thank-yous: I may talk a lot, but this project would not exist without Ramesh and Moo Ryong (ideas are cheap, implementations are not). That CENS and Deborah and Mohammad and Sasank are interested gives me some confidence and support for future work. More personally, working with all of you has been wonderful, a gift for which I am grateful.
Martin
Thank you for your great summary. In my opinion, the user interface is great but has a lot of room for improvement. In particular, besides the temporal and spatial views, a tag view would certainly be helpful: for example, to see simultaneously the videos that have been similarly tagged, to navigate across related tags by clicking on them and seeing the relevant video content, to combine the tag view with the temporal view, etc.
I cc'd the class email list in case others want to add something that could not be discussed yesterday for lack of time.
Sincerely
Mohammad Rahimi
