FAQ

While this FAQ will cover the majority of Herbarium expeditions, we always recommend reading the tutorial and help text when starting a new expedition. We also have Field Guides which give more specific information about some of the more detailed tasks for these expeditions.

1.) What is Digi-Leap? You can learn more about the overall Digi-Leap process in the project Research section. For CAS Plants to Pixels we'll be utilizing some new Zooniverse tasks for some expeditions. For example, the Text From Subject task will allow volunteers to correct blocks of text and then submit them. Volunteers will be presented with the results of OCR from specific specimen labels and will be asked to correct any issues that they find. OCR output can have three types of errors: substitutions, deletions and insertions. The results from Text From Subject will give us better OCR results. In addition, we can use this data to improve future OCR outputs from Digi-Leap.
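For readers curious about what the three OCR error types look like in practice, here is a minimal Python sketch that compares raw OCR text with a volunteer's corrected text and counts each type. It is only an illustration: the function name classify_ocr_errors is hypothetical, it uses Python's standard difflib module, and it is not a description of how Digi-Leap itself scores OCR output.

```python
from difflib import SequenceMatcher


def classify_ocr_errors(ocr_text, corrected_text):
    """Count character-level OCR errors relative to the corrected text.

    A "deletion" is a character the OCR dropped, an "insertion" is a
    character the OCR added, and a "substitution" is a wrong character.
    """
    counts = {"substitutions": 0, "deletions": 0, "insertions": 0}
    # Opcodes describe how to turn the corrected text into the OCR text.
    matcher = SequenceMatcher(None, corrected_text, ocr_text, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        correct_len = i2 - i1
        ocr_len = j2 - j1
        if tag == "replace":
            # Pair up characters as substitutions; any leftover length
            # is counted as dropped or added characters.
            counts["substitutions"] += min(correct_len, ocr_len)
            if correct_len > ocr_len:
                counts["deletions"] += correct_len - ocr_len
            else:
                counts["insertions"] += ocr_len - correct_len
        elif tag == "delete":
            counts["deletions"] += correct_len
        elif tag == "insert":
            counts["insertions"] += ocr_len
    return counts


print(classify_ocr_errors("Oak1and", "Oakland"))
```

Here the OCR read a "1" where the label has an "l", so the sketch reports one substitution and no deletions or insertions.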

2.) Interpretation: In general, you should minimize interpretation of open-ended fields and enter information verbatim. This way, we can better achieve consensus when comparing multiple records against one another. However, some discretion would be nice. Here are examples:

Interpretation that you should make: Simple spacing and capitalization errors (e.g. “3miN. of oakland” should be “3 mi N. of Oakland”).
Interpretation you should leave to us: Don’t interpret abbreviations; we’ll sort that out (e.g. “Convict Lk.”).

3.) Non-English text: While we are currently focused on English language labels, on occasion you may encounter labels in other languages. Transcribe these exactly as written (do not translate to English). Match label content to transcription fields as best as you can. There is a helpful list of common accent marks later in this document.

4.) Spelling mistakes: Transcribe exactly as written, unless you have looked it up and are absolutely certain of a simple spelling mistake. In this case, you can enter the correct spelling. When you make a correction, please use the Done&Talk button to add a comment describing the change; it’s also recommended that you provide a reliable web citation for the change if it’s anything other than a spelling correction of a common word. You can include #error or another relevant hashtag in your comment to flag the type of correction you made.

5.) Problem records: If you come across a problem record, such as a faulty image, that may need to be addressed by a researcher or another member of the project team, you can flag the record by commenting on it with #error or another relevant hashtag.

6.) Capitalization: Sometimes information may be in all capital letters on the labels. Unless this is an acronym, you should capitalize only the first letter of every word in your transcription (e.g. “PLANTS OF THE GREAT BASIN” should be transcribed as “Plants of The Great Basin”).

7.) Multiple/conflicting information: Some labels may have more than one instance of a piece of information, such as:

  • Habitat: Sometimes the label header will describe a biome or locality that contradicts what is written on the label. For example, the label header may say "Plants of the Mojave Desert", but the body text might say "found in salt marsh". In this case, defer to what is in the body text and write "salt marsh" under habitat.

  • Locations: If a specimen is cultivated at one location from cuttings/seeds/rhizomes collected at a different location, only enter the place where the seeds were collected in the Habitat and/or Specimen Description fields and omit the location of cultivation.

8.) Special Characters: What should you type when there is a special character in a text string, such as a degree symbol or language-specific characters? You can do an online search for the symbol or copy and paste it from your word processor’s symbols menu. Some commonly encountered symbols are included at the end of this document.

9.) County: If the county is not stated on the label but other locality information is present, please try to find the appropriate county using our California Place Names document first. If you still cannot find the county, use an online search or the other tools highlighted below, and leave a source in the optional citation field. However, if there are multiple potential counties for a locality and it can’t be determined which is correct, please choose the Unknown County option from the County dropdown for U.S. locations; otherwise leave County blank.

10.) Splitting Location, Habitat, and Specimen Description: Often location, habitat, and specimen description terms will be mixed together, even being interleaved in the same sentence. Here are some simple guidelines to follow when splitting them into separate fields, to help ensure consensus:

  • General/non-specific locales are Habitat, and specific ones are Location, as only very rarely is a species found in the one place the specimen was obtained from (examples: “along road” would be Habitat as it describes the environment the plant grows in, but “along Smith Ranch Road” would be Location). In general, it is best practice not to repeat information in two fields, but it can be done when appropriate for clarity.

Grammar and punctuation standards for Transcription:

  • If possible, don’t introduce punctuation like commas or semicolons; instead use what is there. If transcribing only part of a phrase, it is ok to add terminal punctuation (i.e., a period) where otherwise there would be none.

  • You might need to combine several phrases from the label that are not immediately adjacent to one another for certain fields. In that case, separate the phrases with a period.

  • Replace unnecessary dangling non-terminal punctuation as needed with a period. For example, “Dry roadside, east of Tecopa” would result in “Dry roadside.” as the Habitat field, dropping the dangling comma as it doesn’t terminate a sentence properly.

  • If you are transcribing part of a phrase that is entirely within parentheses, you may omit the parentheses and add a terminal period if necessary. If the phrase in parentheses is transcribed as part of a longer sentence, leave the parentheses in your transcription.

  • Capitalize new sentences (as in the example above) caused by the split.

Data that goes into Habitat:

  • Substrate: substrate is the description of the soil, or surface on which a specimen was found growing. For example, loam, granite, sand, etc…
  • Associated species: e.g. “growing with Fissidens submarginatus, Poa annua etc.”
  • Floodplain describes a habitat. This often occurs with a river name, so for “San Joaquin River floodplain”, include the text in the Habitat field, since in this case it wouldn’t be accurate to just have “San Joaquin River” in the locality field.
  • Power lines: these may help narrow a location, but they say more about the habitat in which the plant grows, as power line corridors are usually cleared of larger shrubs and trees.
  • Life Zone: you may see labels mention Life Zones like "Upper Sonoran Life Zone". These correspond to the Merriam or Holdridge Life Zone classification systems and should be placed in "Habitat". You can read more about them here.

Data that goes into Specimen Description:

  • Any relevant information about the color, height, maturity, inflorescence or additional physical characteristics of the specimen.
  • Information describing plant population like "Local", "Abundant", "Occasional", etc.
  • Added information in later labels: occasionally in a later determination the scientist will add information about the specimen, e.g., its condition, maturity or number of chromosomes; this should be included after the primary label’s data. Be careful not to transcribe the scientific name on the determination, only any relevant specimen information (this also applies to other fields, though it is far less likely to find additional info for them).
  • “n=” followed by a number; this is the number of chromosomes.

Data to exclude that describes Locality:

  • Locality data includes descriptions of county names, geographic regions (e.g. San Bernardino Mountains, San Joaquin River floodplain), national parks, town names, road names, or any other landmarks that can be used to identify the location of a specimen. This includes bearings or distance from any landmark if also present in the label text. Ignore coordinates as those will be entered in a separate field.
  • Specific locality information has already been transcribed; therefore it can be ignored and, when possible, should not be included in the habitat or specimen description fields.

11.) How to transcribe coordinates and altitude:

  • Elevation/Altitude: This information will be entered into two separate fields: a text box for entering just the elevation number, and options for selecting units in either meters or feet. Elevation is a numeric field, so it is best to remove non-numeric text like “ca.”, “approx.”, commas or semicolons. This includes commas that denote thousands; for example, "3,000" should be written as "3000". However, if a symbol like “.” or "," serves as a decimal point and removing it would change the meaning of the number, it should be left in.

If there is no elevation unit or the elevation unit is illegible, select “unknown” on the dropdown menu. In the uncommon case where altitude is listed in both meters and feet, defer to transcribing the altitude in meters. Lastly, if Altitude is listed as approximate with "about" or "~" and there is no range (min/max elevation) it is ok to write the altitude number as written on the sheet with no modifications.
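The elevation guidance above can be sketched in a few lines of Python. This is only an illustration of the rules as stated, not the project's own processing code; the function names clean_elevation and elevation_unit are hypothetical, and the unit spellings the sketch recognizes (m, meters, metres, ft, feet, foot) are an assumption about what labels typically contain.

```python
import re


def clean_elevation(raw):
    """Return only the elevation number from label text, as a string."""
    # Remove commas that group thousands, e.g. "3,000" -> "3000".
    text = re.sub(r"(?<=\d),(?=\d{3}(?!\d))", "", raw)
    # Keep the first number, including a minus sign and any decimal mark.
    # Words such as "ca." or "approx." are skipped because they are not digits.
    match = re.search(r"-?\d+(?:[.,]\d+)?", text)
    return match.group(0) if match else ""


def elevation_unit(raw):
    """Pick the unit option; meters wins if both units are present."""
    text = raw.lower()
    if re.search(r"(?<![a-z])(m|meters?|metres?)(?![a-z])", text):
        return "meters"
    if re.search(r"(?<![a-z])(ft|feet|foot)(?![a-z])", text):
        return "feet"
    return "unknown"


print(clean_elevation("ca. 3,000 ft"), elevation_unit("ca. 3,000 ft"))
```

For the label text “ca. 3,000 ft” this gives the number 3000 with the unit feet. A label with no recognizable unit falls through to “unknown”, matching the dropdown option described above.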

  • Latitude and Longitude: Traditional latitude and longitude coordinates will each be entered verbatim into two fields, Latitude1 and Longitude1 respectively. If there is more than one latitude and longitude, the verbatim text of the second pair can be entered into the fields Latitude2 and Longitude2. See the special symbols list below for how to generate the degree symbol °.

  • UTM (Universal Transverse Mercator): UTM is a coordinate system which divides the Earth into 60 zones, each 6 degrees of longitude wide. Often a UTM zone is further denoted with a letter to specify its north-south location on the Earth. Within each zone, coordinates are measured with “Northing” and “Easting” in meters. On a specimen label it might look like this: “11S, 37760370.5 mN., 365348.2 mE., WGS84”. This data will be entered into 4 fields: zone (11S), UTMNorthing (37760370.5), UTMEasting (365348.2), and UTM Datum (WGS84). Note that Easting and Northing are entered in numeric form.

  • Public Land Survey System (TRS): This is the T (township), R (range) and S (section) data used to establish location. For example, SW1/4 NW1/4 S13, T1SR20E refers to the southwest quarter of the northwest quarter of Section 13 of Township 1 South, Range 20 East. Quarter sections “1/4” should be written as 3 characters, not one (¼). This information will be entered into 3 text boxes: township (1S), range (20E) and section (SW1/4 NW1/4 S13).
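To show how the UTM and TRS examples above break down into their separate fields, here is a small Python sketch. It is an illustration only: the function names split_utm and split_trs are hypothetical, and the patterns assume the label follows the same layout as the two examples given (northing before easting, township immediately followed by range). Real labels vary, so treat this as a picture of the field mapping rather than a general parser.

```python
import re

# Assumed layout: zone, northing "mN", easting "mE", datum (as in the example).
UTM_PATTERN = re.compile(
    r"(?P<zone>\d{1,2}[C-HJ-NP-X]?)\s*,?\s*"
    r"(?P<northing>\d+(?:\.\d+)?)\s*mN\.?\s*,?\s*"
    r"(?P<easting>\d+(?:\.\d+)?)\s*mE\.?\s*,?\s*"
    r"(?P<datum>[A-Za-z]+\d+)"
)

# Assumed layout: "T" + township, then "R" + range, e.g. "T1SR20E".
TR_PATTERN = re.compile(
    r"T\s*(?P<township>\d+[NS])\s*,?\s*R\s*(?P<range>\d+[EW])"
)


def split_utm(label_text):
    """Split a UTM string into zone, northing, easting and datum."""
    match = UTM_PATTERN.search(label_text)
    if match is None:
        return None
    return {
        "zone": match.group("zone"),
        "northing": float(match.group("northing")),  # numeric field
        "easting": float(match.group("easting")),    # numeric field
        "datum": match.group("datum"),
    }


def split_trs(label_text):
    """Split a TRS string into township, range and section."""
    match = TR_PATTERN.search(label_text)
    if match is None:
        return None
    # Whatever is left after removing township and range is the section.
    rest = label_text[:match.start()] + label_text[match.end():]
    section = rest.strip(" ,.").replace("¼", "1/4")  # quarter as 3 characters
    return {
        "township": match.group("township"),
        "range": match.group("range"),
        "section": section,
    }


print(split_utm("11S , 37760370.5 mN. , 365348.2 mE. , WGS84"))
print(split_trs("SW1/4 NW1/4 S13, T1SR20E"))
```

Run on the two examples from the text, the sketch yields zone 11S, northing 37760370.5, easting 365348.2 and datum WGS84 for the UTM string, and township 1S, range 20E and section SW1/4 NW1/4 S13 for the TRS string.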

12.) Information to Omit/Skip: The following data should not be transcribed (unfortunately, for the sake of consensus, even if you want to). However, if you do find something interesting, feel free to use Done&Talk to post a comment about it.

  • Synonyms listed adjacent to the primary determination (example: for “Cyperus echinatus [=C. ovularis]” only transcribe “Cyperus echinatus”)
  • Common names of species, as many species have multiple common names, some of which are only locally used.
  • Information already entered into one of the dropdown or text fields. For example, if the label indicates “collected on salt flats, -600 ft.”, the altitude will already be entered in “Elevation - Minimum/Maximum” and “Elevation Units”, so it should be omitted from the habitat text, as it would be redundant.
  • “Collected as part of a survey…” and similar “This specimen was examined as part of a study of…” entries, as they are part of a series of information that relates to annotations of the specimens and are not considered to be core information that we are trying to collect.
  • Hyphens that break a word across two lines. For example “speci-” at the end of a line and “men” at the beginning of the next line would be transcribed as “specimen” without the hyphen.
  • Personal comments by the collector that do not relate to the specimen.
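The rule about hyphens that break a word across two lines can be pictured with a short Python sketch. The function name join_line_break_hyphens is hypothetical and the sketch is deliberately naive: it joins every word split by a hyphen at the end of a line, so a genuinely hyphenated term that happens to break at its hyphen would also be joined. Deciding that case still takes a human reading the label.

```python
import re


def join_line_break_hyphens(lines):
    """Join label lines, removing hyphens that only mark a line break."""
    text = "\n".join(line.strip() for line in lines)
    # A hyphen directly before a line break, between two word characters,
    # is treated as a line-break hyphen and removed with the break.
    text = re.sub(r"(?<=\w)-\n(?=\w)", "", text)
    # Any remaining line breaks simply become spaces.
    return text.replace("\n", " ")


print(join_line_break_hyphens(["This speci-", "men was found"]))
```

This prints “This specimen was found”, matching the “speci-” / “men” example in the list above.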

13.) “s.n.” as the collector number: this stands for the Latin sine numero, meaning “without number”. In this case you should enter “s.n.” in the Collector Number field.

14.) What do I do when there are multiple collections on one herbarium sheet?
You can tell that there is more than one collection on a herbarium sheet by the
presence of multiple barcodes and collection labels.

If a sheet with multiple collections has fewer accession numbers than barcodes, then transcribe the accession number closest on the sheet to the given label and barcode. It is acceptable to have more than one barcode assigned to the same accession number.

When you're working with an image of a specimen that has two or more
collections on it, proceed as follows:

  • Click the “(i) Subject Info” icon that appears below the image, at the lower
    left-hand corner

This is the metadata button – it will provide you with the file name,
which corresponds to our barcode number

  • On the image, find the barcode number that matches the number in the
    file name from the Metadata.

  • Enter the data closest to or in line with that barcode.

  • Proceed through the data-entry workflow.

Use the Done & Talk button if you are having a problem in order to flag
the specimen for the team.

15.) Accession Number Issues

  • If there is no accession number, select any Herbarium Code and type [No Accession] in the text box.

  • If there are more barcodes than accession numbers on a multi-specimen sheet, enter the herbarium code and accession number closest to the designated barcode and label. It is not uncommon for a single accession number to be assigned to more than one barcode.

16.) Specimens with no labels
If a specimen has no labels at all, proceed without filling in the text fields, answer “no” to the multiple-choice questions, and then click Done.

Some Useful Tools (discovered or developed by Notes From Nature users)

Counties and Cities: Lists on Wikipedia are good tools for finding counties and similar information. There is a list of every county in California (and similar lists for other states), for example https://en.wikipedia.org/wiki/List_of_counties_in_California (via the linkbox you can also change the state). If you see an unfamiliar town or landmark, you can look it up in this list of California place names: https://drive.google.com/file/d/1ijX9NtL-YMh4BFBhYYqhzs0RbzcJ_rvp/view?usp=sharing

Mountains: https://en.wikipedia.org/wiki/Category:Lists_of_mountains_of_the_United_States
Uncertain Localities: The GeoNames geographical database https://www.geonames.org/
Mapping tool with topo quads: To find uncertain counties or localities http://mapper.acme.com

Hard-to-read text: Use “Sheen”, the visual webpage filter, for some hard-to-read handwriting written in pencil. (Tip was from the War Diary Zooniverse project) https://chrome.google.com/webstore/detail/sheen/mopkplcglehjfbedbngcglkmajhflnjk?hl=en-GB

Special symbols: You should be able to find symbols in Word, or by doing an online search and then copying and pasting. Here are a few:

  • degree symbol for coordinates: °
  • plus minus: ±
  • non-English symbols: Ä ä å Å ð ë ğ Ñ ñ õ Ö ö Ü ü Ž ž

Other symbols may be found on Penn State’s Symbol Codes: Accents, Symbols and Foreign Scripts page: http://sites.psu.edu/symbolcodes/codehtml/

ClipX: Freeware Windows clipboard enhancer that saves the last 1,024 items copied to the clipboard and allows them to be pasted through its icon in the system tray. Nothing short of a lifesaver for Ornithology but quite helpful in Herbarium too: http://clipx.en.softonic.com/

Dates: If all parts of the date are written with numerals and it’s unclear which part is the day and which is the month (for example, 2-4-91), https://en.wikipedia.org/wiki/Date_format_by_country identifies which date format (day-month-year or month-day-year) is commonly used in each country.
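To see why an all-numeral date can be ambiguous, here is a minimal Python sketch that tries both readings. The function name date_readings is hypothetical. One assumption to note: Python's two-digit year handling places “91” in 1991, which may not suit older labels, so the century should always be checked against the rest of the label.

```python
from datetime import datetime

# The two readings discussed above, as Python date format strings.
FORMATS = (
    ("day-month-year", "%d-%m-%y"),
    ("month-day-year", "%m-%d-%y"),
)


def date_readings(text):
    """Return every valid reading of a numeric date such as "2-4-91"."""
    readings = {}
    for name, fmt in FORMATS:
        try:
            readings[name] = datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            # This reading is impossible, e.g. a month number above 12.
            pass
    return readings


print(date_readings("2-4-91"))
print(date_readings("25-4-91"))
```

For “2-4-91” both readings are valid (2 April or 4 February), so the collecting country's convention is needed to choose between them. For “25-4-91” only the day-month-year reading survives, because there is no 25th month.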