Boxes introduction and examples
The Boxes tool is for quick checks and cross checks of lists, particularly those containing student numbers, exam numbers, course codes, URLs, or email addresses.
Video examples: Introduction and more examples.
Boxes tries not to make too many assumptions about what you need to do, or where your data is coming from, while (hopefully!) making some common tasks possible. If you regularly find yourself doing the same thing in Boxes, a tool or spreadsheet that does exactly what you want would be better.
Before looking at examples, here is an overview of what you’ll do:
- Copy text into the boxes from the clipboard with Ctrl-V. This text can contain “junk”, as long as it contains something Boxes can identify, such as course codes, student IDs, or a tab-separated table from copying a webpage or spreadsheet.
- The first row of options lets you process/tidy the data in a single box. Examples:
  - extract course codes, student numbers, URLs, or emails;
  - open courses or URLs in new tabs or copy emails to the clipboard;
  - find or tidy tab-separated tables;
  - remove duplicates, or count how many times they occur.
- The second row of options lets you interact with data in another box. When you select which other box to use, the command is executed. Find differences or intersections between two lists, to help notice when items are unexpectedly missing or duplicated.
The commands in the second row let you compare rows based on any student numbers or course codes that they contain. As you’ll see below, when you use these features, you often don’t need to tidy up the data (the first row of options) at all.
Examples:
- Compare two DPTs. Which courses are only on one of the DPTs?
- Find what’s not being used on a list. Check a DPT against the sortable course list.
- Using multiple sources in different formats
- Some things you can do with a class list
(These examples used 2018/19 data. To navigate between sessions in the DRPS you might like DRPS arrows.)
Opening webpages: The Boxes Open menu can open new tabs for all of the URLs in a box, or for all of the course codes in a box (using either the DRPS or the timetable). However, your browser may block the new tabs. In Firefox, increase dom.popup_maximum in about:config and also set dom.block_multiple_popups to false. In Chrome, look for the popup blocking notification on the right-hand side of the location bar, click on it and say that you always want to allow the popups.
Compare two DPTs
Here we copy in two DPTs that share a lot of courses, and check which courses are only on one of them.
Open the 2018/19 DPT for Computer Science (BSc Hons) in a new tab, select and copy the whole page with Ctrl-A Ctrl-C. Then click in Box A of Boxes and paste with Ctrl-V (or right-clicking and selecting paste).
Open the 2018/19 DPT for Informatics (MInf) in a new tab, use the mouse to select the first four years, copy with Ctrl-C, and paste into Box B of Boxes. You don’t need to copy each part separately; you can copy one block of text in one go, including intermediate “junk”.
In Box A select Compare in the first drop-down in the second row. To the right of that, select “Find DRPS codes only in one box.” Then click other box and select Box B.
Box A should now contain the text below. You can make a text area in Boxes bigger by dragging its bottom right-hand corner.
    # Only in Box A:
    INFR10044 40 credits Honours Project (Informatics)
    INFR11102 10 credits Computational Complexity
    # Only in Box B:
    INFR08010 20 credits Informatics 2D - Reasoning and Agents Must be passed
    INFR10051 40 credits MInf Project (Part 1)
    INFR11123 10 credits Scalable Data Management Systems
    INFR11162 10 credits Neural Computation
Some of these differences are expected. A couple might be mistakes to be checked further.
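For readers who want to see the idea behind this comparison, here is a minimal Python sketch. It is not how Boxes itself is implemented. It assumes that a course code is four capital letters followed by five digits (as in INFR10044), and the function names and the XXXX00001 sample row are made up for illustration.

```python
import re

# Assumption: a DRPS course code is four capital letters then five digits.
COURSE_CODE = re.compile(r"\b[A-Z]{4}\d{5}\b")


def rows_by_code(text):
    """Map each course code to the first row of text that mentions it."""
    rows = {}
    for row in text.splitlines():
        for code in COURSE_CODE.findall(row):
            rows.setdefault(code, row.strip())
    return rows


def only_in_one(box_a, box_b):
    """Return (rows only in A, rows only in B), compared by course code."""
    a, b = rows_by_code(box_a), rows_by_code(box_b)
    only_a = [a[code] for code in a if code not in b]
    only_b = [b[code] for code in b if code not in a]
    return only_a, only_b


# Sample data: the XXXX00001 row is invented to stand for a shared course.
box_a = "\n".join([
    "Some junk copied along with the table",
    "INFR10044\t40 credits\tHonours Project (Informatics)",
    "XXXX00001\t10 credits\tMade-up shared course",
])
box_b = "\n".join([
    "XXXX00001\t10 credits\tMade-up shared course",
    "INFR10051\t40 credits\tMInf Project (Part 1)",
])

only_a, only_b = only_in_one(box_a, box_b)
print("# Only in Box A:", *only_a, sep="\n")
print("# Only in Box B:", *only_b, sep="\n")
```

Rows without a course code never enter the comparison, which is why the pasted “junk” does not need to be tidied first.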
Find what’s not being used on a list
Most of the courses that Informatics run are currently listed on the MInf degree DPT. Let’s check what’s not there. There are multiple ways to find out; here we prune down the sortable course list and look at what’s left.
Copy-paste the contents of these pages into Boxes:
- Sortable course list (2018/19 snapshot) into Box A.
- MInf degree DPT into Box B.
In the second row of controls for Box A, select “Discard row if Col 2 (EUCLID Code) is part of a row in”, and choose Box B as the other box.
Aside: Boxes discarded the junk around the table, because those rows didn’t have a column 2. We could have instead said “Discard row if row has DRPS code in”, but we would have retained the junk. We would then clean the results: in the first row select Extract → Rows with tabs (table) or Rows with course codes.
The courses left are those not on the MInf DPT. There are quite a lot left, so we get rid of the bulk of those we know shouldn’t be on the MInf by typing or copy-pasting the following into Box C:
    EPCC
    Distance
    Thesis
    Dissertation
    Project
    Design Informatics
Then in Box A, change the Discard criterion to “if row contains a row in”, and now choose Box C as the other box. (Alternatively you could have clicked Clear in Box B and reused it.)
At the time of writing, what was left was:
    Course URL EUCLID Code Acronym AIA COG FSS ML NS SE Level Points Year Delivery Exam Diet Work%/Exam% Lecturer(s)/Coordinator(s)
    Computer Programming Skills and Concepts INFR08022 CP 8 20 1 S1 December 20/80 Cristina Alexandru / Ajitha Rajan
    Decision Making in Robots and Autonomous Agents INFR11090 DMR 11 10 5 S2 April/May 40/60 Ram Ramamoorthy
    Informatics 1 - Cognitive Science INFR08020 INF1-CG 8 20 1 S2 April/May 40/60 Frank Keller / Christopher Lucas / Richard Shillcock
    Informatics Research Review INFR11136 IRR 11 10 5 S1 100/0 Bjoern Franke
    Introduction to Java Programming INFR09021 IJP 9 10 5 S1 100/0 Paul Anderson
    Introduction to Research in Data Science INFR11138 IRDS 11 20 5 S1 100/0 Amos Storkey
    Pervasive Parallelism INFR11108 PERP 11 20 5 S1 100/0 Murray Cole
    Robot Learning and Sensorimotor Control INFR11186 RLSC 11 10 5 S2 April/May 40/60 Michael Mistry
CP is for outside students, and INF1-CG doesn’t need to be on the DPT. IRR, IJP, IRDS, and PERP are MSc/CDT-only courses. That leaves DMR and RLSC… maybe they should be on the DPT…
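The two Discard steps used in this example can be sketched in Python as follows. This is an illustration under assumptions, not Boxes’ own code: the function names are hypothetical, rows are assumed to be tab-separated, and the XXXX rows are invented sample data.

```python
def discard_if_column_in(rows, column, other_rows):
    """Drop rows whose column (1-based) is part of any row in other_rows.

    Rows that lack the column are dropped too, which is what removes
    the junk around a pasted table.
    """
    kept = []
    for row in rows:
        cells = row.split("\t")
        if len(cells) < column:
            continue  # not a table row: it has no such column
        value = cells[column - 1].strip()
        if value and any(value in other for other in other_rows):
            continue  # the column's text appears somewhere in the other box
        kept.append(row)
    return kept


def discard_if_contains(rows, patterns):
    """Drop rows that contain any non-empty pattern as a substring."""
    patterns = [p for p in patterns if p.strip()]
    return [row for row in rows if not any(p in row for p in patterns)]


# Made-up sample data standing in for the three boxes.
box_a = [
    "Sortable course list (junk line without tabs)",
    "Course URL\tEUCLID Code\tAcronym",
    "Computer Programming Skills and Concepts\tINFR08022\tCP",
    "Made-up Course on the DPT\tXXXX00001\tMUC",
    "Made-up Dissertation\tXXXX00002\tMUD",
]
box_b = ["XXXX00001 10 credits Made-up Course on the DPT"]
box_c = ["Dissertation", "Project"]

step1 = discard_if_column_in(box_a, 2, box_b)
step2 = discard_if_contains(step1, box_c)
print("\n".join(step2))
```

The header row survives both steps because “EUCLID Code” does not appear in the other boxes, which matches the header row being left in the results above.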
Using multiple sources in different formats
Imagine I want to find Informatics courses that are listed in the DRPS and on the MInf degree DPT, but not on the sortable course list. Maybe these courses should be on the Informatics sortable list?
There are many ways to do it. Here is one:
Copy-paste the contents of the pages into Boxes:
- DRPS courses into Box A
- MInf degree DPT into Box B.
- Sortable course list (2018/19 snapshot) into Box C.
You can label them in the box banners if you might forget which is which.
In the second row of Box A’s controls, select “Keep row if row has DRPS code in” then click other box and select Box B. Box A now contains only the Informatics courses that appear on the MInf DPT.
Still in the second row of Box A’s controls, change Keep to Discard and select Box C, to remove courses that are in the sortable list.
We see an initially-surprising long list of results:
    INFR09009 Computer Architecture Not delivered this year 10
    INFR10049 Agent Based Systems (Level 10) Not delivered this year 10
    INFR10061 Elements of Programming Languages Not delivered this year 10
    INFR10005 Intelligent Autonomous Robotics (Level 10) Not delivered this year 10
    INFR11069 Adaptive Learning Environments 1 (Level 11) Not delivered this year 10
    INFR11021 Computer Graphics (Level 11) Not delivered this year 10
    INFR11049 Computer Networking (Level 11) Not delivered this year 10
    INFR11022 Distributed Systems (Level 11) Not delivered this year 10
    INFR11129 Formal Verification Not delivered this year 10
    INFR11024 Parallel Architectures (Level 11) Not delivered this year 10
    INFR11113 Topics in Natural Language Processing Not delivered this year 10
However, we see that none of the courses are running. If that wasn’t obvious, we could discard these rows (type “Not delivered” into Box D and use that). Or sort by delivery mode: change Only Show in the first row of controls in Box A to Sort A-Z and then select Column 4.
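The Keep-then-Discard chain in this example amounts to an intersection followed by a difference on course codes. A rough Python sketch, assuming course codes look like four capital letters plus five digits and using hypothetical function names and partly invented sample rows:

```python
import re

# Assumption: a DRPS course code is four capital letters then five digits.
COURSE_CODE = re.compile(r"\b[A-Z]{4}\d{5}\b")


def codes_in(text):
    """Return the set of course codes found anywhere in the text."""
    return set(COURSE_CODE.findall(text))


def filter_rows(rows, other_text, keep):
    """Keep (keep=True) or discard (keep=False) rows sharing a code with other_text.

    A row with no course code never matches, so Keep drops it
    and Discard retains it.
    """
    other_codes = codes_in(other_text)
    return [row for row in rows if bool(codes_in(row) & other_codes) == keep]


# Sample data: the XXXX rows are invented for illustration.
drps = [
    "INFR09009\tComputer Architecture\tNot delivered this year\t10",
    "XXXX00001\tMade-up course on both lists\tSemester 1\t10",
    "XXXX00002\tMade-up course not on the DPT\tSemester 2\t20",
]
dpt = "Year 3\nINFR09009 Computer Architecture\nXXXX00001 Made-up course on both lists"
sortable = "Made-up course on both lists\tXXXX00001\tMUC"

on_dpt = filter_rows(drps, dpt, keep=True)            # Keep: Box A against Box B
result = filter_rows(on_dpt, sortable, keep=False)    # Discard: Box A against Box C
print("\n".join(result))
```

Because the match is made on codes found anywhere in a row, the three sources can be in different formats without any tidying first.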
Some things you can do with a class list
Select and copy a class list that you have open in Euclid with Ctrl-A Ctrl-C. Click into Box A of Boxes and paste with Ctrl-V (or right-clicking and selecting paste).
At the right of the first row of buttons, next to Only Show, click Column and select 5 Programme. You should get a column of text saying which degrees people taking the class are on.
Click Duplicates and Counts of unique rows. You’ll get a list like this:
    17 Computer Science (MSc) (Full-time)
    5 Artificial Intelligence and Computer Science (BSc Hons)
    1 Programme
    1 Semester 2 Courses for Visiting Students MAT
The “1 Programme” entry is spurious: that was the column header…
Press Ctrl-Z twice to undo the counting and column selection, and you’ll have the class list back. (Ctrl-Y redoes the changes if you don’t do anything else first.)
So we won’t have to undo next time, take a copy of the class list in Box A into Box B. You can do that with standard copy-paste, or click in Box B and type A followed by Enter.
In Box B I could see how students are enrolled on the course, by choosing to Only show Column 7 Course Mode of Study, then selecting Duplicates → Counts of unique rows. I might get something like:
    19 CE
    2 C
    2 E
    1 Course Mode of Study
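Selecting a column and then counting unique rows, as done twice above, could be sketched in Python like this. It is only an approximation of what Boxes does: the function names are hypothetical, the class list is made-up three-column data, and sorting by descending count is an assumption based on the sample outputs.

```python
from collections import Counter


def only_show_column(rows, column):
    """Return the given 1-based column of each tab-separated row.

    Rows that do not have that many columns are skipped.
    """
    values = []
    for row in rows:
        cells = row.split("\t")
        if len(cells) >= column:
            values.append(cells[column - 1])
    return values


def counts_of_unique_rows(rows):
    """Return 'count value' lines, most frequent first."""
    return [f"{n} {value}" for value, n in Counter(rows).most_common()]


# Made-up class list; a real one has more columns.
class_list = [
    "UUN\tProgramme\tCourse Mode of Study",
    "s0000001\tMade-up Programme A\tCE",
    "s0000002\tMade-up Programme A\tCE",
    "s0000003\tMade-up Programme B\tE",
    "s0000004\tMade-up Programme A\tCE",
    "s0000005\tMade-up Programme B\tE",
]

modes = only_show_column(class_list, 3)
counts = counts_of_unique_rows(modes)
print("\n".join(counts))
```

As in Boxes, the column header is counted as if it were data, so it shows up as a spurious entry with a count of 1.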
Let’s say I wanted to look at the rows for students that are taking the exam (CE or E) in Box A. Three of the possibilities:
- You could change Only show to Sort AZ, then sort the column and look for these two blocks. But that’s unwieldy as Boxes doesn’t render a large table nicely.
- To filter the rows, edit Box B by hand to contain only:
      CE
      E
  You can initially extract the second column using the Only show Column feature if you want. Then in Box A, “Keep row if Col 7 (Course Mode of Study) is a row in”, and choose Box B.
- Hack: We could look for a column that ends in a capital E (you might get false positives). In Box C enter an E then Insert tab. Then in Box A you can select “Keep row if row contains a row in”, click match case (or false positives are likely), then for other box select Box C.
If I wanted to email just the students taking the exam, I could now select Extract → Student numbers as emails. If my email client needs a comma separated list, I can choose Edit then Line breaks → commas. Then copy all the email addresses with Ctrl-A Ctrl-C.
Or in another box I could Extract → Student numbers on some other list, for example a list of those who had submitted an assignment. Then in the box with those taking the exam, Extract → Student numbers followed by Discard row if row is a row in the box with assignment student numbers. I’ll be left with those who didn’t submit the assignment, but should have, and I can check what happened.
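The last two tasks can be sketched in Python too. Everything specific here is an assumption made for illustration: the student number pattern (“s” followed by seven digits), the placeholder mail domain example.org, the function names, and the sample data.

```python
import re

# Assumption: a student number is "s" followed by seven digits.
STUDENT_NUMBER = re.compile(r"\bs\d{7}\b")


def extract_student_numbers(text):
    """Return every student number found in the text, in order."""
    return STUDENT_NUMBER.findall(text)


def as_emails(student_numbers, domain):
    """Turn student numbers into addresses at the given (placeholder) domain."""
    return [f"{number}@{domain}" for number in student_numbers]


def discard_rows_in(rows, other_rows):
    """Drop rows that are exactly equal to some row in other_rows."""
    other = set(other_rows)
    return [row for row in rows if row not in other]


# Made-up data: students taking the exam, and assignment submissions.
exam_box = "s0000001\tCE\ns0000002\tE\ns0000003\tCE"
submitted_box = "Submission from s0000001\nSubmission from s0000003"

exam_students = extract_student_numbers(exam_box)
email_list = ",".join(as_emails(exam_students, "example.org"))  # line breaks -> commas
missing = discard_rows_in(exam_students, extract_student_numbers(submitted_box))

print(email_list)
print(missing)
```

Extracting the student numbers in both boxes first is what makes the exact row-for-row comparison work, since the surrounding text in each list differs.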