JGFF "To Do" List:
~~~~~~~~~~~~~~~~~

A non-ordered list of ideas and suggestions for future enhancements
to the JewishGen Family Finder (JGFF). There are many, many months'
worth of work here, as well as some items which will also require
ongoing care and maintenance. We don't expect all these suggestions
to be implemented in the near future. This file is just a place to
try to collect all of these ideas, so they don't get lost.

Warren Blatt, October 2000
Last Updated: May 7 2004, Dec 13 2004, Jun 30 2009.

-------------------------------------------------------------------
1 - JGFF Town-name Cleanup.

Apply town name changes/corrections for the 65,000 user entries
which were made during the non-monitored period of 1998-1999.
This is a big, important, multi-step job, which will make use of the
town data that was corrected by our "country experts" in early 2000
via the "JGFF CleanUp" system at
< http://www.jewishgen.org/wconnect/wc.isa?jg~jgsys~jgffclean2~xxx >.

Many questions linger, e.g. do we continue to use the place names
from the circa 1990 USBGN / "Where Once We Walked" / Soviet-era
Russian-language names, or the newer 1996 USBGN / WOWW2 / post-Soviet
Ukrainian, Belarussian and Moldovan native language names?
Most probably the latter. The "JGFF Cleanup" system used the earlier
names, so applying those changes is only a first step in this
multi-step process.

In any case, we most probably want to wait for the revised edition of
"Where Once We Walked" to be published (expected in late November
2002) before making a final decision, and should try to keep the JGFF
data in sync with the most recent WOWW.

Also, we should roll out the new JGFF town names in sync with rolling
out the NEW ShtetlSeeker... (preview version up at
< http://www.jewishgen.org/ShtetlSeeker/loctownexp3.htm >), which also
contains the newer town names. Ideally, all three (new WOWW, new
ShtetlSeeker, new JGFF town names) should be done at the same time,
to keep everything in sync.
See items #1a and #1b below.

At the same time that we convert the town names, we should take this
opportunity to add the "USBGN Feature Code" as a new column in the
JGFF database, for all unambiguous towns. This is part of the
preparation for ShtetlMaster (see #1b, #7, #8, #9 below).

-------------------------------------------------------------------
1a - JGFF Town-name Cleanup -- Sync with WOWW2 for Bel, Ukr, Mold.

As a major step towards town-name cleanup, we will need to convert
all locality names in Belarus, Ukraine and Moldova from their
Soviet-era Russian-language names to their post-Soviet Ukrainian,
Belarussian and Moldovan native language names, as contained in the
Revised Edition of "Where Once We Walked" (2002) and the new
ShtetlSeeker.

To accomplish this task most easily, we would ideally need a table
of WOWW data, containing:
  - Old Name (Town + Country)
  - New Name (Town + Country)
  - USBGN Feature Code (or Lat/Long)
for each JGFF locality in Belarus, Ukraine and Moldova (and
potentially other countries: Russia, Czech/Slovakia, former
Yugoslavia, etc.).

The work done by Michael Tobias and Alex Sharon for the revised
"Where Once We Walked" (henceforth, "WOWW2") likely contains exactly
this information, and Gary Mokotoff has verbally agreed to lend us
this data for the specific one-time purpose of updating the JGFF, in
order for it to be in sync with his new publication.

Using this table, we should be able to automate converting the
*majority* of the place names in the JGFF. There are two problematic
areas:
  - When there is more than one town with the same name in the
    "Old Name" column.
  - When there is a town name in the JGFF with no matching entry in
    the "Old Name" column.
In either of these cases, the easiest thing to do is to simply leave
the existing JGFF entry as is. Some instances could be investigated
and 'guessed' with relative accuracy, e.g.
when there are several towns with the same name, and one is a very
large town and the others very small towns (with regard to Jewish
population), then assume that the entries refer to the larger town.
(Also see Item #47).

The OldName should then be added as a synonym for the NewName, so
that if someone tries to enter the OldName, it is automatically
converted into the NewName. (Also see items #8, #56).
Exceptions: If the OldName is ambiguous (i.e. there's more than one
town with the same name), or if the OldName is identical to a NewName
for a *different* town, then do NOT add a synonym pair.

-------------------------------------------------------------------
1b - JGFF Town-name Cleanup -- Add "USBGN Feature Code" column,
     Move town names to separate table.

We should also take this opportunity, during the town name
conversion, to store the "USBGN Feature Code" in the JGFF database,
in a new column, for all unambiguous towns. This is part of the
preparation for ShtetlMaster (see #7, #8, #9 below), and the key to
all future development.

In a truly normalized database, the JGFF's DATA.DBF file would
contain *only* the "USBGN Feature Code", instead of the redundant
town and country name columns. The town and country names would all
be in a separate table (hereafter called "TOWNS.DBF") -- and thus
each locality name would appear only *once* in the JGFF tables,
making the maintenance of town names much easier. The association
between the two tables (DATA.DBF and TOWNS.DBF) would be via the
"USBGN Feature Code" number.

While this separation might make the search algorithms more complex
and/or time-consuming (there's an additional level of lookup
indirection required -- an SQL "JOIN" between tables), it will make
things much cleaner (see related Items #2a, #63).
Associating the USBGN Feature Codes for all localities in the JGFF
will not be possible in the short term (see Item #1a above), so this
implementation might have to be phased -- not ALL localities would
initially have a true USBGN Feature Code. For ambiguous or undefined
localities (those not in the ShtetlSeeker), perhaps a special
indicator (e.g. "0" or "-1") should be used in this new column -- or
better, use a sequence of unique numbers which are out-of-range of
those used by the ShtetlSeeker USBGN data -- these could be
considered to be "artificial USBGN codes".
[There are 20,000 unique localities in the JGFF. 70% of the data is
for East/Central European countries covered by the USBGN data. The
remainder of the localities can probably be consolidated (e.g. typos,
US cities with and without the state name, etc.). Thus we probably
need a range of less than 10,000 "artificial USBGN code" numbers for
the "non-ShtetlSeeker" localities].
Or perhaps use an alphabetic prefix for the "artificial USBGN code"
numbers (if this isn't a numeric column); or have a separate column
in the TOWNS.DBF file for non-USBGN localities.

The USBGN Feature Code column should then become a REQUIRED input
whenever a 'new' town (i.e. a town not yet in the JGFF) is added by a
JGFF editor. This will ensure that the town data remains clean.
This will require a new administrative interface for inserting new
town names, replacing the current "!YES!" suffix method with one that
includes the USBGN Feature Code as an input.

This should also eliminate the need for the "FORCE", "ACCEPT" and
"NOCHANGE" flags in the "Gotcha" column in the JGFF data records --
the USBGN Feature Code column will supplant it.

Here are the current and proposed data structures for DATA.DBF:

Current DATA.DBF columns:
  - Code       -- JGFF Researcher Code # - link to RESEARCHER.DBF
  - Surname    -- Surname being researched
  - Town       -- Town being researched
  - Country    -- Country containing town. 3-4 letter abbrev.
  - Source     -- No longer used
  - Gotcha     -- Contains a numeral, 1-5 (or blank for most)
  - LastChange -- Date last updated

New DATA.DBF proposed columns:
  - Code       -- JGFF Researcher Code # - link to RESEARCHER.DBF
  - Surname    -- Surname being researched
  - TownID     -- USBGN Feature Code -- link to TOWNS.DBF
  - Gotcha     -- ??? Still needed ???
  - LastChange -- Date last updated

Basically, the "Town" and "Country" (and possibly the "Gotcha")
columns have been replaced by the new "TownID" column, which contains
the USBGN Feature Code, being a link to the new TOWNS.DBF file.
The old "Source" column is also eliminated.
The revised DATA.DBF file should be half of its current size.

The new TOWNS.DBF table would contain, at a minimum:
  - TownName
  - Country
  - USBGN Feature Code -- link to ShtetlSeeker database

Additional fields in TOWNS.DBF might contain the unaccented or
display version of the town name; town name synonyms; and linkages
to other JewishGen databases (Yizkor Book Bibliographic database,
etc.), perhaps via boolean flag fields or bitmasks, as part of
ShtetlMaster. (Or perhaps these linkages are better accomplished via
separate function-specific tables, each keyed by USBGN Feature Code).

But I hope that we can avoid having any data in TOWNS.DBF be
duplicative with the ShtetlSeeker database (i.e. accented versions of
the town name, town synonyms) -- Each piece of data should ideally
appear only once throughout the system.

Note that TOWNS.DBF is an "intermediate" table, which sits between
the DATA.DBF file and the ShtetlSeeker table. We can't use the
ShtetlSeeker table directly, because:
  - We need to limit the set of towns searched to be *only* those
    towns in the JGFF.
  - Some entries in the JGFF are not in the ShtetlSeeker:
      + Unverified towns (i.e. older entries; unverifiable names).
      + Towns not in Eastern/Central Europe.
      + Region names (Provinces / Gubernias / States, etc.).
      + Ambiguous town names with qualifiers (See Item #47).

9/19/2002, 11/11/2002.
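
As a rough illustration of the proposed DATA.DBF / TOWNS.DBF split,
here is a minimal sketch using Python's built-in sqlite3 module in
place of the real VFP tables. Everything in it is an assumption made
for illustration: the table and column names, the feature code 12345,
the sample rows, and the choice of negative numbers as "artificial
USBGN codes" (which presumes that real codes are positive). The
"Gotcha" column is left out, since it is not yet decided whether it
is still needed.

```python
import sqlite3


def make_db():
    """Create in-memory stand-ins for TOWNS.DBF and the proposed DATA.DBF."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE towns (
            town_id   INTEGER PRIMARY KEY,  -- USBGN Feature Code (or artificial code)
            town_name TEXT NOT NULL,
            country   TEXT NOT NULL
        );
        CREATE TABLE data (
            code        INTEGER NOT NULL,   -- JGFF Researcher Code
            surname     TEXT NOT NULL,
            town_id     INTEGER NOT NULL REFERENCES towns(town_id),
            last_change TEXT
        );
    """)
    return conn


def next_artificial_id(conn):
    """Next unused negative TownID, for a locality not in the ShtetlSeeker.

    Assumption: real feature codes are positive, so negative numbers
    can never collide with them.
    """
    (lowest,) = conn.execute("SELECT MIN(town_id) FROM towns").fetchone()
    if lowest is None or lowest >= 0:
        return -1
    return lowest - 1


def search_surname(conn, surname):
    """Look up entries for a surname; town names come from the JOIN."""
    return conn.execute(
        "SELECT d.code, d.surname, t.town_name, t.country "
        "FROM data AS d JOIN towns AS t ON t.town_id = d.town_id "
        "WHERE d.surname = ? ORDER BY d.code",
        (surname,),
    ).fetchall()
```

Because each locality name is stored only once, a rename such as
Grodno -> Hrodna becomes a single UPDATE on the towns table, and
every researcher entry linked to that TownID shows the new name on
the next search.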
-------------------------------------------------------------------
1c - JGFF Town-name Cleanup -- Update jgffsyn2 database.

Once the revised WOWW2 is published and we've converted the town
names in the JGFF to the new names, we should also update our
internal "jgffsyn2" database of town-name synonyms, so that this tool
is in sync with WOWW2 and the new town names used in the JGFF.
We will need to obtain this data from Gary, as we did in the past.

-------------------------------------------------------------------
2 - Document Advanced Search Features.

There are several undocumented search features, notably the use of
"*" and "?" wildcard characters in Exact Spelling mode, and the use
of bracket [] characters in D-M Soundex mode for specifying exact
character positions. Add these to the JGFF FAQ, and also create an
InfoFile describing all the various search options, with examples.
Link this InfoFile from the "Search Type" pull-downs in all JewishGen
databases, since these options apply to nearly all databases.

A very rough draft of the "Search Types" InfoFile is at
< http://www.jewishgen.org/infofiles/SearchTypes.htm >, 6/2002.
[Also see < http://www.avotaynu.com/csi/csi-home.html > for a brief
description of the advanced D-M Soundex bracket enhancement].

However... We don't want to publicly document these features YET --
because of some bugs and potential data-mining issues.
See Items #2a and #2b below.

-------------------------------------------------------------------
2a - Bugs in wildcard searches.

There are some definite quirks with the wildcard searches (the use
of ? and * characters in Exact Spelling Match mode).
They don't handle blank spaces (multi-word towns) well.

For example, searching for
     town - "???*g", "Russia"
will NOT find "St Petersburg". You need to search for
     town - "?? *g", "Russia"
to locate these (note the space). That one doesn't seem so bad.

But... searching for
     town - "S??* *g", "Russia"     (note the space)
yields some very strange results...
Towns that start with "S", and some SURNAMES that end with "G"!

Separating the Towns and Surnames into separate tables (see Item #1b
above) might help resolve some of these issues.

Multi-word towns and Partial Text Mode:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The code should handle multi-word towns better. In Partial Text
mode, it does NOT locate text in the second word. Ideally, it would
work like the JRI 1929 Polish Business Directory Town Index, where
multi-word town names are handled very well.
7/18/2002

*** These issues were resolved 11/5/2002.
*** Need to do a thorough regression test.

-------------------------------------------------------------------
2b - Add restrictions to wildcard searches.

We need to add some more restrictions to the wildcard searches, to
prevent data-mining. We currently prevent the use of the "*" wildcard
within the first three characters, but we have NO restrictions on the
use of the (currently undocumented) "?" single character wildcard.

By using the queries "???*a", "???*b", "???*c", "???*d", etc.,
someone can mine nearly the entire database in just 26 queries.
We need to limit the use of the "?" character -- perhaps allow only
ONE of these within the first three characters.
(Also see related issue, Item #5a.)
7/18/2002

-------------------------------------------------------------------
3 - US States -- Enforce use of two-letter state/province codes.

For all entries of localities within the USA and Canada, enforce the
use of the two-letter state/province code abbreviation, after a comma
and space. Too many entries do not specify a state name, or spell
out the state name, or abbreviate the state name incorrectly, or omit
the comma, or omit the space, or other erroneous permutations -- all
making the entries ambiguous, incorrect and/or inconsistent.

There are various ways to implement this enforcement...
Here are some ideas, presented from the simplest to the highest
degree-of-difficulty:

a) We should programmatically require that all town entries for the
   Countries "USA" and "Canada" have the sequence comma, space, and
   two alpha characters as the last four characters in the Town
   field. This would be a simple syntax check. **

b) A further step would be enforcing only *valid* state and province
   abbreviations (i.e. the user's input must match one item in our
   set of known states/provinces), as per
   < http://www.jewishgen.org/InfoFiles/codes.txt >.
   (also available programmatically in the file
   "/htdocs/CURE/statelist.txt").

c) A further step would be the automatic conversion of full state
   names and erroneous abbreviations into their proper two-letter
   codes (e.g. "Miami, Fla" -> "Miami, FL"; "Boston, Mass" ->
   "Boston, MA").

d) Even further -- develop a locality validation scheme for the USA
   and/or Canada, using some off-the-shelf database (online, or
   local), to validate the actual town names. This is a larger,
   longer-term project. For the time being, we can probably rely on
   the existing JGFFTowns entries to validate the new incoming
   entries.

** There are two EXCEPTIONS to this rule:

1) The names of the 50 U.S. States and the 13 Canadian Provinces and
   Territories ARE allowed as valid entries. We do permit state-wide
   entries, e.g. "Ohio, (State)", "Massachusetts, (State)",
   "Ontario, (Province)", etc., WITHOUT the need for a two-letter
   abbreviation suffix. We would need an internal table containing
   the names of the 50 U.S. States and 13 Canadian Provinces and
   Territories, as per
   < http://www.jewishgen.org/InfoFiles/codes.txt >, to be checked
   before enforcing the two-letter suffix rule.
   OR... since all of these 60+ "(State)" and "(Province)" entries
   already exist in the JGFFTowns table, we could let the table
   lookup validate these first.

2) The global wildcard term "Any" is allowed in the Town field.
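
Steps (a) through (c) and the two exceptions above could be sketched
roughly as follows, in Python rather than the VFP the JGFF actually
runs on. The code sets below are tiny illustrative subsets, not the
real contents of codes.txt or statelist.txt, and the function name
and return shape are hypothetical.

```python
import re

# Illustrative subsets only (assumptions); a real check would load the
# full state/province list rather than hard-code it.
VALID_CODES = {"NY", "MA", "FL", "OH", "ON", "QC"}
ALIASES = {"FLA": "FL", "FLORIDA": "FL", "MASS": "MA", "MASSACHUSETTS": "MA"}
STATEWIDE = {"Ohio, (State)", "Massachusetts, (State)", "Ontario, (Province)"}

# Step (a): town name, then comma + space + two letters at the very end.
SUFFIX = re.compile(r"^(?P<town>.+), (?P<code>[A-Za-z]{2})$")


def check_town(town, country):
    """Return (ok, town), where town may have been normalized."""
    town = town.strip()
    if country not in ("USA", "Canada"):
        return True, town              # rule applies to USA/Canada only
    if town == "Any" or town in STATEWIDE:
        return True, town              # the two exceptions

    # Step (c): convert known long forms, e.g. "Miami, Fla" -> "Miami, FL".
    head, sep, tail = town.rpartition(", ")
    if sep:
        key = tail.rstrip(".").upper()
        if key in ALIASES:
            town = f"{head}, {ALIASES[key]}"

    match = SUFFIX.match(town)         # step (a): syntax check
    if match is None:
        return False, town
    code = match.group("code").upper()
    if code not in VALID_CODES:        # step (b): must be a known code
        return False, town
    return True, f"{match.group('town')}, {code}"
```

With these sample tables, check_town("Miami, Fla", "USA") is accepted
and normalized to "Miami, FL", while "Boston" (no state) and
"New York,NY" (no space after the comma) are rejected.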
An algorithm to implement this might be:

1) First check the input Town entry against the JGFFTowns database,
   as we do currently.

2) If the input Town name is not found in the JGFFTowns database,
   AND the input Country is USA or Canada, then do the additional
   syntax check: If the last four characters are NOT comma + space +
   two-alpha-characters, then return a different error message.

The new error message could be something like:

     To specify an ancestral town in the United States or Canada,
     you must indicate the state or province, using its official
     two-letter postal abbreviation. The state or province
     abbreviation should follow the town name, after a comma and a
     space. For example: "New York, NY", or "Toronto, ON".

See more below under Item #27, "Town name validation for places
outside Eastern Europe".

-------------------------------------------------------------------
4 - Accents.

Allow the input/display/handling of accented characters.
This will more accurately reflect the native town names, for those
countries that use the Latin alphabet. The German researchers appear
to be particularly affected by the absence of this feature...
They love their umlauts ;8-)

When searching, unaccented characters should match accented
characters (as Inktomi and most other search engines do).
Our JRI-Poland and ShtetlSeeker databases have this feature.

We could perhaps implement this in phases, one character set at a
time. Do the easiest one first -- the Western European character
set, which includes German and French. The harder Slavic and Baltic
character sets could come later.

Another phased approach could initially allow for the *display* of
accented characters only, and not their input. Again, this is
similar to the approach taken by the JRI-Poland and ShtetlSeeker
search engines.

-------------------------------------------------------------------
5 - Display search results in batches, for very large results.
Break them into smaller batches, with a "Get next batch" feature, as
is already done for other large datasets on JewishGen.

Reason: Very large result sets (e.g. all surnames from "Kiev" or
"Vilnius" or "Wien"), which return THOUSANDS of matches, can overflow
a browser's memory, or take an unreasonable amount of time to
download and display, especially for users with older browsers /
slower modems / limited memory, etc.

This issue is growing as our database continues to grow, and will
become even more problematic if and when proximity searches are
implemented (see item #9 below), and wildcard searches are documented
(see item #2 above).

250 or 500 names is probably a reasonable default batch size.

-------------------------------------------------------------------
5a - Upper limit on total number of matches.

We probably should also implement an ultimate upper limit on the
number of matches that can be returned from the JGFF in a single
query -- just like we have for other databases -- to prevent
data-mining.

I propose that no more than something like 3,000 matches or 1% of the
entire database should be returned in any one query. No researcher
needs to see over 3,000 matches for their own *personal* research!

Queries such as searching for Town = "Any" in some countries (e.g.
Poland, Russia, Germany) can currently return over 5,000 matches.
A D-M Soundex search for Surname = "Cohen" currently returns over
2,750 matches. These are easy targets for data-mining.

If a query results in over 3,000 matches / 1% of the database, an
error message should be returned to the user, giving the total number
of matching entries, and suggestions for filtering their search so
they can see some of the matching entries.
9/2002

-------------------------------------------------------------------
6 - Display all of a fellow researcher's entries.

We currently allow only each researcher him/herself to see their
entire list of surnames/towns.
We might want to also make this list available to other researchers,
perhaps via a hyperlink from the researcher name/code in the
far-right column in the search results screen.
But there might be some privacy issues here... to be considered.

One user wrote:
   "Often I have found a researcher looking for a common name, like
    mine. If it was possible to see other names that the researcher
    had listed, it would allow a better possibility of determining
    if there was a connection. If privacy is a concern, this could
    be configured as an allowable option for each user."

-------------------------------------------------------------------
7 - ShtetlMaster linkage - ShtetlMaster Locality Page.

"ShtetlMaster" is a long-term project, to create hyperlinks from all
town names to an auto-generated "ShtetlMaster Locality Page" for that
town, and vice-versa. For details, see
< http://www.jewishgen.org/projects/desc/ShtetlMaster.html >.

This will need to be implemented gradually, probably one country at a
time, as the authoritative ShtetlMaster locality database is
developed for each country.

Note that not ALL localities in the JGFF will be linked -- only the
major Jewish communities -- which should be the majority of entries.
Entries for the smaller localities will remain unlinked to
ShtetlMaster.

The first phase implementation of the "ShtetlMaster Locality Page"
can be fairly simple, consisting of data that we already have.
Each town in the JGFF that is tagged with a USBGN Feature Code (see
items #1a and #1b above) would have a hyperlink created from every
instance of that town name in the JGFF, linking it to the
dynamically-generated "ShtetlMaster Locality Page" for that town.

The URL of a "ShtetlMaster Locality Page" would be of the form
< http://www.jewishgen.org/wconnect/wc.isa?jg~jgsys~shtetm~12345 >,
where "12345" is the USBGN Feature Code of the locality.
Each town's "ShtetlMaster Locality Page" will contain the basic
information for that town -- which can all be generated from the new
ShtetlSeeker database:
  - Modern native town name, modern country name;
  - Alternate and historical names for the town;
  - Latitude and Longitude;
  - Distance/Direction from the country's capital city;
  - Maps: either a small auto-generated map of the town's region,
    centered on that town; or just links to MapQuest, Expedia, and
    MultiMap, as we have in the new ShtetlSeeker results.

Additional 'easy' features for the "ShtetlMaster Locality Page"
could be:
  - The number of current JGFF entries for this locality (as the
    JGFF data entry pages have, when presenting choices for invalid
    town entries);
  - A button to get all JGFF entries for that town (as ShtetLinks
    pages have). Perhaps combine with the above;
  - Buttons for other databases (e.g. Yizkor Book Bibliography?);
  - Links to appropriate SIG home page(s), based upon country.

Additional 'hard' features would require the compilation of
additional data and programming:
  - the town's political jurisdictions, various time periods.
  - the town's Jewish population figures, various time periods.
  - names of the nearby Jewish communities (within 10-20 miles, or
    the nearest 5-10 communities), with each hyperlinked to its
    respective ShtetlMaster Locality Page (similar to a feature in
    "Shtetls of Belarus").
  - link to generate a fuller list of nearby Jewish communities,
    similar in form to the "LDS Microfilm Master" results display:
    < http://www.jewishgen.org/databases/ldsdist.htm >.
  - links to JewishGen resources for that town -- (Yizkor Book
    bibliographic database and translations, ShtetLinks pages, etc.).
  - links to JewishGen resources for the containing regions
    (gubernias, countries, etc.) -- SIGs, etc.

All these are part of the overall ShtetlMaster plan, which can be
implemented gradually.
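
The "Distance/Direction" and "nearby Jewish communities" features
listed above both come down to great-circle arithmetic on latitude
and longitude. The sketch below uses the standard haversine and
initial-bearing formulas in Python; it is not how ShtetlSeeker
actually computes these values, and the function names, the Earth
radius constant and the 8-point compass rounding are assumptions
made for illustration.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an approximation

COMPASS_POINTS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def compass_direction(lat1, lon1, lat2, lon2):
    """8-point compass direction from point 1 toward point 2."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlon)
    bearing = (math.degrees(math.atan2(y, x)) + 360) % 360
    return COMPASS_POINTS[int((bearing + 22.5) // 45) % 8]


def nearby(center, towns, radius_miles):
    """Towns within radius_miles of center, nearest first.

    center is (lat, lon); towns maps a name to (lat, lon).
    Returns a list of (name, miles, direction) tuples.
    """
    lat0, lon0 = center
    hits = []
    for name, (lat, lon) in towns.items():
        miles = distance_miles(lat0, lon0, lat, lon)
        if miles <= radius_miles:
            hits.append((name, miles, compass_direction(lat0, lon0, lat, lon)))
    return sorted(hits, key=lambda hit: hit[1])
```

The same nearby() helper, called with a town's own coordinates and a
10-20 mile radius, would produce the short "nearby communities" list;
called with the capital city as the center, it gives the distance and
direction line for the Locality Page.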
The initial "ShtetlMaster Locality Page" could even be a small popup
dialog window, rather than a full-blown webpage, or come up in a
separate window (like the FTJP's Family Display Page).

-------------------------------------------------------------------
8 - Town Synonyms - ShtetlMaster.

If someone searches for "Kovno" or "Chenstahov", currently the JGFF
returns "No matches found". A major enhancement would be to search
the ShtetlMaster synonym database, and map the user's input to the
appropriate modern town name.

Note that this would work for "Exact Spelling" searches only -- D-M
Soundex would likely yield too many matches (probable false
positives). When doing a D-M search, perhaps the user could be
presented with a *list* of matching town entries from the
ShtetlMaster synonym database (with some basic info about each
locality: e.g. the primary native name, its synonyms, and the number
of matching JGFF entries), and could then select from that
hyperlinked list, to see the corresponding JGFF entries.

The ShtetlMaster synonym database could also be used in other parts
of the system, such as data entry, when showing possible town name
matches for rejected entries (see Item #50); when sending annual
renewal notices (see item #44); when giving hints for "No matches"
searches (see item #62), etc.

These synonyms become more important once we start using the new
post-Soviet native Belarussian, Ukrainian and Moldovan town names
(see Item #1a). No one would ever think of searching for "Hrodna"
for Grodno, or "Vawkavysk" for Volkovysk, etc. -- some of these new
native names don't have the same D-M Soundex code as the old name.

Also see item #56, Synonym Manager Utility.

-------------------------------------------------------------------
9 - ShtetlMaster - Proximity Search.

Once towns are tagged with their USBGN Feature Codes (which map to
latitude/longitudes), we can then implement a "show me all LEVINs
within 50 miles of Warsaw" feature.
We would have to limit the radius distance for "town only" searches,
to avoid returning too much data (i.e. "show me everyone researching
any family within 999 miles of Warsaw" should not be allowed), to
prevent data-mining (see Items #2a and #5 above).

Other possibilities might be the capability to find all matches
within a particular district, gubernia, or other geo-political
jurisdiction -- once that jurisdictional data is added to the
ShtetlMaster database.

This feature could be restricted, to be available only to donors to
JewishGen, as one of the "value-added services". This could be
implemented by using ASP -- converting the search page
"/jgff/jgffweb.htm" to be "/jgff/jgffweb.asp". The "proximity
search" inputs would be greyed-out for non-donors, as an
incentive/teaser to encourage them to donate.

-------------------------------------------------------------------
10 - Display the Researcher Code number more prominently
     in the "Your JGFF Data" edit screens, and elsewhere.

Reason: People seem to be always forgetting their Researcher Codes...
or insisting that "you never gave me one". The JGFF Password Desk
(password@jewishgen.org) is flooded with requests daily.

If their Researcher Code number was displayed to them with every
instance of their data, perhaps there would be fewer "You never gave
me a Researcher Number" messages.

* Partial implementation was completed 5/2001.

-------------------------------------------------------------------
11 - Break the Researcher Address info into more specific fields.

For instance, divide the current "City, State or Country" field into
separate "City", "State/Province", and "Country" fields.
Separate the current "Full Name" field into distinct "Surname" and
"First Name" fields. The latter would force people to enter their
full names, and we'd know which name was which (surname vs. given
name), to better identify users.
This would also enable us to better detect duplicate researcher
entries (see #26 below).
Having a separate "Country" field would allow for a better display,
would force people to enter their country name (which some people
omit), and could give us better statistics about where our users
are... as well as help in providing the proper required input format
for people-search engines.

The "Country" field should be a pull-down, rather than a data input
field that people type into... because of the high variability of
data entry possibilities (e.g. "US", "U.S.", "USA", "U.S.A.",
"United States", "America", etc.). As other sites do, list the most
likely countries at the top of the pull-down (USA, Canada, England,
France, Israel, etc.), where the majority of our users are, and then
list all the remaining countries alphabetically.

However, retro-fitting all the existing address data into the new
fields would be very difficult without a LOT of manual editing.
But we should probably start this for NEW researcher entries.

*** This was implemented 9/2001. There are now new distinct
    GivenName, Surname, Town, State, and Country fields in the JGFF
    Researcher database. This is enforced for all NEW entries, but
    the existing data was NOT retro-fitted. An announcement was
    posted 10/16/2001 to all groups, asking existing JGFF users to
    edit their address record into the new fields, but the response
    rate was minimal.

*** Another notification to all users was made 11/2001, via the
    first "Point.to-Point" newsletter, and it provided explicit
    instructions: < http://www.jewishgen.org/jgff/jgffupdate.html >.

*** We have now manually retro-fitted all users' data into the new
    fields.

-------------------------------------------------------------------
12 - Disallow/Restrict spaces in the Surname field.

Quite often, researchers misunderstand what the JGFF is, and enter
*given names* in the "Surname" field, along with the surname.
If we disallowed spaces in the "Surname" field, we could prevent
this. However...
there are a few Jewish surnames (Spanish / Italian / Dutch) that
actually DO contain spaces, so we can't easily do this without
upsetting a few people.

I think that the best solution to this is to have a specific set of
allowable prefixes with space (i.e. "de", "la", "von", "der", "ben",
etc.), and ONLY those prefixes should be allowed. This small set of
valid allowable prefixes could probably be determined by scanning the
current JGFF surnames for entries which contain spaces, eliminating
the questionable and bogus entries, and using the remainder as the
set of valid allowable prefixes.

Another, simpler approach (though less perfect): Just issue a warning
message if the Surname field contains any spaces.

ALL other invalid characters (punctuation, numbers, etc.) should be
automatically rejected. Only the 26 letters of the alphabet should
be allowed... plus the space (in the context noted above), and
perhaps the hyphen.

We also need to review ALL existing data in the DATA.DBF file for
surnames which contain spaces. Many are erroneously entered given
names -- or people who have entered two or more surnames using the
word "or" (e.g. "Cooper or Coopers"). These all need to be manually
cleaned up.

Also scan for and eliminate all other invalid characters (slashes,
question marks, parentheses, numerals, etc.), which people have
entered. The system is still accepting question marks and other
punctuation marks as valid inputs in the surname and town fields.
This should be programmatically disallowed. (See item #52).

-------------------------------------------------------------------
13 - Internal: On the "Search JGFF Researcher Names" page at
http://www.jewishgen.org/jgff/jgffweb2.htm, in the results page, it
would be nice if one were able to go directly to that researcher's
"Modify" pages via a hyperlink.

-------------------------------------------------------------------
14 - Printed version of the JGFF.
Some JGSs want a printed paper copy of the JGFF, like the version
that Gary Mokotoff and then the New York JGS published before
JewishGen took over the JGFF in 1996. A printed version of the JGFF
is still very useful for the JGSs, since more people can easily
access it at JGS meetings and other venues.

When we took over the JGFF from the New York JGS, we did promise them
that we would continue to produce a printed version of the JGFF.
I believe that we did produce one or two printed versions in 1997 and
1998, and distributed them to various JGSs at cost (about $50 each).
But the JGFF is now too costly to print and ship -- since the JGFF is
now 5 times the size of the last printed version (June 1998).

A good option would be to produce a read-only PDF version, and let
the JGSs individually download and print it as they need. This takes
all the printing/shipping costs out of the equation for us, and all
the costs are borne by the JGSs. We just have to come up with a good
data format (following the style of the previous printed versions of
the JGFF -- Gary Mokotoff might be helpful here).
Distributing the PDF printed version on CD-ROM is another idea, since
downloading it might be prohibitive.

We do have one new problematic area: Researchers who have opted to
not display their names/addresses, who are contactable only via the
new "blind contact" email system. To maintain their privacy, we
cannot print their contact information in the printed version. Users
of the printed edition would still have to go online and use the
"blind contact" system in order to reach these researchers.

-------------------------------------------------------------------
15 - "Any Country" town entries -- clean up.

There are some 40+ "Any Country" entries which have town names, which
are all invalid. The only allowable town entry for Country = "Any
Country" is Town = "Any".
The existing bad entries should be cleaned up, and such entries should be disallowed in the future, programmatically.

9/20/2002 -- I cleaned up all of the existing "Any Country" entries with town names. I think that the system disallows future entries in this category... DONE.

-------------------------------------------------------------------
16 - Multiple Button press suppression.

If a user presses a "Submit" button more than once (such as a quick double-click), multiple requests to the server can be generated, which can result in some VFP locking problems. We have some JavaScript code that can be used to prevent this. We've already implemented it on the "Add Researcher Details" button when creating a new JGFF Researcher... but we should also add this JavaScript code to other parts of the JGFF system as well... namely updating the surnames/towns.

One example of how to do this in JavaScript can be found at < http://www.dynamicdrive.com/dynamicindex11/submitonce.htm >.

-------------------------------------------------------------------
17 - JGFFHelp Form.

When people write in to the JGFF Help line, they often neglect to include important information: their name, their JGFF Researcher Code number, the source of their information, what country, etc., etc.

If we could have a fill-in *form* instead of just a simple "mailto" URL, and the form had input spaces which prompted them for this information, we could serve them much better -- without us guessing or having to go back and ask them to supply the missing information. We could then replace all hypertext links to the JGFFHelp address throughout the system with a hypertext link to this form.

The new JGFFHelp form would be similar to the Tech Support form I created at < http://www.jewishgen.org/JewishGen/TechSupport.html >.

** This was partially implemented December 2000.
I created an intermediate explanatory "JGFF Help" page, < http://www.jewishgen.org/jgff/jgffhelp.htm >, and replaced ALL mailto hyperlinks to the JGFFHelp address throughout the system with links to this page. This has been very successful -- JGFFHelp now receives messages with much more complete information. We should continue implementation by creating a web form, to ensure that users fill out all of the information fields that we need to assist them.

-------------------------------------------------------------------
18 - "Email Details are Suspect" message removal.

When a researcher edits their email address, if there was previously an "Email Details are Suspect" notation placed there by the JGFF administrator (i.e. a "bounced email" note), this notation should be automatically removed (even though we don't really know if the newly-entered email address is a good one), so that the researcher can proceed.

7/2002 -- This also perhaps needs to be coordinated with the Goldmine integration.

-------------------------------------------------------------------
19 - Upgrade to the latest versions of VFP / WebConnect.

The version of West Wind WebConnect that we're currently running on www.jewishgen.org (WC 2.5) is about five years old (1997). We are missing their bug fixes, new features, and performance improvements.

Upgrading to a 4.X version of West Wind Web Connection will require some code changes, and it would also require changes to other VFP programs on the same server. So perhaps this could be implemented by moving all JGFF-related programs onto a separate server. This should also be done for all other JewishGen database systems. (See < http://www.west-wind.com > for info on West Wind Web Connection).

-------------------------------------------------------------------
20 - Ideas about using virtual disks (RAM disks).

To improve performance. This is an (expensive) advanced feature, for once we have a Gig or two of memory to spare, to dedicate solely to this function.
We currently have problems where large searches time out (notably the administrator's name search at < http://www.jewishgen.org/jgff/jgffweb2.htm >), and using RAM disks might be something that would help.

At JGFF initialization time, the entire JGFF database would need to be read from physical disk into the RAM disk (slowing down initialization, but speeding up all subsequent access), and then all searches would be done from the RAM disk. An issue would be updating both physical and RAM databases in parallel, when edits/updates are made.

Another (easier) idea is simply to upgrade various hardware and software components: faster CPUs, more CPUs, new motherboard with faster bus, faster disks (15,000 rpm), etc. -- or to move some of the database programs onto a separate machine from the web server.

-------------------------------------------------------------------
21 - JGFF Alerts facility.

By registering with a "JGFF Alerts" facility, the researcher will automatically receive an email message whenever someone else adds a new surname/location pattern which matches the researcher's own specified pattern(s). This will save researchers from having to periodically search the database.

We should probably reserve this feature as a "members only" premium pay service. To avoid generating lots of unwanted alerts forever, and to keep the database's addresses fresh, the alerts facility should be given a limited specified lifespan. At the end of the lifespan (say 6 months, or a year) the researcher will be sent a message saying that the alert is about to expire and that if they wish they can renew it.

However, this service would present its share of issues:

- Performance: It would slow down the system a bit, because *every* time a new entry is added, we'd need to search the JGFF database for matches, and then send any email alerts.

- Types of matches: We'd obviously have to limit the TYPES of matches that would be checked.
We shouldn't allow a general "Minsk" query -- otherwise every time that someone entered new data for Minsk, we'd have to send 2,000 emails. But for smaller towns, this would be useful and reasonable, thus it would be nice to allow it (i.e. "every time anyone adds/modifies any data for Simnas, I'd like to be alerted"). Limiting it to surname AND town would produce a reasonably small number of matches, which are much more likely to be relevant... But do we match on exact spelling, or D-M Soundex? I think that the best compromise is to do an *exact* match for the town name, and a soundex match on the surname.

- Bounces: Yet more email bounces to deal with.

- Researcher Table: Add new column, to contain the alert service's expiration date.

Would require ongoing care and maintenance; documentation; an administrative interface; billing/renewal notice tie-in, etc.

*** Michael completed a test implementation 6/2001.
- Beta testing began 10/16/2001.
- Offer announced 11/2001, for $100 donors.
- See < http://www.jewishgen.org/jgff/jgff-faq.htm#q3.7 > for user documentation.

Also see: Item #21a, re JGFFAlerts Enhancements; Item #61, re listing status on Researcher Info page; Item #64, re "show all my matching entries" web feature.

-------------------------------------------------------------------
21a - Enhancements to JGFFAlerts facility.

Users have suggested various enhancements to the JGFFAlerts. Namely, some would like to be notified every time anyone adds *any* reference to their town OR surname, instead of the "AND" relationship we've currently implemented.

These towns and surnames would need to be selected individually, and would therefore require a brand new user interface in order for the user to select what type of notifications they'd like for each surname/town pair:

- Surname AND Town (current mode)
- Surname (Exact Spelling, or Soundex)
- Town (Allow exact match only)

I think that we would still like to have some limitations... e.g.
Do we really want to permit users to be notified every time someone adds/edits any entry for "Warszawa", "New York", "London", "COHEN", "LEVIN", etc.? The number of emails generated could be enormous. I'm not sure how to easily resolve this one.

Also see Items #61, #64.

9/30/2002

-------------------------------------------------------------------
22 - Contact all researchers for a town.

A feature to allow a researcher to easily send an email to ALL researchers found as matches in the search results page. Currently, you have to cut-and-paste each email address individually into an email message. And for 'blind contact' researchers, you need to do them one at a time. (Suggestion from a user, 5/2001)

Having a "Contact All" email form would make this easier... although there's also more potential for abuse... so this would be a policy decision.

-------------------------------------------------------------------
23 - Internationalization.

Full-scale internationalization of the JGFF into languages other than English. This includes all JGFF web pages, documentation, words and phrases generated inside the program code, such as column headings, error and warning messages, etc.

A very difficult and time-consuming effort to do thoroughly. We would need to put all text 'snippets' into a separate file or database/spreadsheet table, keyed by language code, and then provide that spreadsheet to a translator, who would translate the words/phrases for that particular language, in a separate column, and return it to us for integration.

As a start, have the JGFF-FAQ document translated. < http://www.jewishgen.org/jgff/FAQ >.

* 1/2002: We now have some translators, and have begun the translation of the JGFF-FAQ documentation:
- Spanish - Carlos Glikson - completed 2/2002.
- Russian - Mark Dworkin - completed 2002.
- Portuguese - Max Blankfield
- French - Viviane Berneman Ship - Aug 2004.

We could key the language code off of the country-of-residence in the user's CURE record.
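As a rough illustration of the snippet-table idea, here is a minimal Python sketch that keeps each text snippet keyed by language code and falls back to English when a translation is missing. The snippet keys, the sample translations, and the country-to-language mapping are all hypothetical placeholders, not actual JGFF data; a real lookup would presumably live in the existing web code rather than in Python.

```python
# Hypothetical snippet table: snippet key -> {language code -> text}.
# In practice this would be loaded from the translators' spreadsheet.
SNIPPETS = {
    "col_surname": {"en": "Surname", "es": "Apellido", "fr": "Nom de famille"},
    "col_town": {"en": "Town", "es": "Ciudad", "fr": "Ville"},
    "err_no_match": {"en": "No matching entries found."},
}

# Hypothetical mapping from country of residence to a default language code.
COUNTRY_LANGUAGE = {"Argentina": "es", "France": "fr", "USA": "en"}

DEFAULT_LANGUAGE = "en"


def language_for_country(country):
    """Pick a language code from the researcher's country of residence."""
    return COUNTRY_LANGUAGE.get(country, DEFAULT_LANGUAGE)


def snippet(key, language):
    """Return the snippet in the requested language, else in English."""
    translations = SNIPPETS[key]
    return translations.get(language, translations[DEFAULT_LANGUAGE])
```

For example, snippet("col_surname", language_for_country("Argentina")) picks the Spanish column heading, while a snippet with no Spanish translation yet is shown in English. Falling back to English means a partially translated table never produces a blank label.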
Some of the current HTML pages could be converted into ASP pages, and the ASP pages could look up and fill in the terms in the appropriate language.

-------------------------------------------------------------------
24 - Email address validation.

People often mis-type their email address when registering in the JGFF... and then there's no way that anyone can contact them. The bad email addresses fall into several categories:

1) Some of them are simple typos, which we could catch by having them enter their email address TWICE, and ensuring that both versions match. See sample JavaScript code in < http://www.jewishgen.org/listserv/test.htm >.

2) Some are people who don't know their own email address, or don't know what an email address is... (e.g. WebTV'ers, AOL'ers). We can help them via some simple syntax checking, i.e. ensure that the address contains alpha-characters, then an @ sign, then more alpha-chars containing at least one period. More sophisticated syntax checking is very difficult. This checking needs to be done on the backend, since front-end methods such as JavaScript won't work for AOL and WebTV browsers -- the people who need it the most. (I've implemented this in the CGI Perl scripts for support and mailing list subscriptions).

3) People who purposefully provide a bogus email address. These are quite difficult, if not impossible, to detect, due to the way that Internet email works. There are some "email address vetting" facilities out there, including one by Elsop, whose LinkScan product we recently purchased. None of them are perfect, but they would at least help weed out a FEW bad addresses.

The only true way to actually validate an email address is to send an email(!), and get a response. A more radical scheme would be to not allow access until we send an email to that address, and receive a valid response from them. There are other websites and software that use this method of authentication.
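The double-entry comparison from category 1 and the simple backend syntax check from category 2 could be sketched roughly as follows. This is only an illustration in Python, not the existing Perl code; the function name, the error wording, and the exact pattern are assumptions.

```python
import re

# Deliberately loose pattern: some characters, an @ sign, then a domain
# part containing at least one period. No whitespace or second @ allowed.
SIMPLE_EMAIL = re.compile(r"^[^@\s]+@[^@\s.]+(\.[^@\s.]+)+$")


def check_email_entry(first_entry, second_entry):
    """Return an error message for the user, or None if the entry passes."""
    first = first_entry.strip()
    second = second_entry.strip()
    # Category 1: the address is typed twice and both copies must agree.
    if first.lower() != second.lower():
        return "The two email addresses do not match."
    # Category 2: basic shape check only, not proof that the mailbox exists.
    if not SIMPLE_EMAIL.match(first):
        return "This does not look like a valid email address."
    return None
```

A check like this only catches typos and malformed input. It cannot detect a well-formed but bogus address; only an actual email round trip can do that.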
Requiring such an email confirmation would change our paradigm greatly, and would be a bit more inconvenient for users -- but it would weed out most of the bogus user entries, and would create a better user database in the long term.

(Also see related Item #49, on users with no Surname/Town data).

-------------------------------------------------------------------
25 - Merge users -- Confirmation page, and Dupe deletion.

When merging two JGFF Researcher Code records (an administrative function, at < http://www.jewishgen.org/jgff/jgffmove.htm >), any duplicate town/surname entries should be eliminated. (See Item #58).

Also, merging users is currently much TOO easy to mess up... If you type one of the Researcher Code Numbers wrong, you end up merging two completely unrelated users -- and there's no "undo" feature... We should implement a confirmation "Are you sure?" intermediate page, showing the address details of both of the users about to be merged, side-by-side, before proceeding. We already have a similar mechanism on the "Delete User" administrative interface.

7/2002 update -- Now also requires coordination with GoldMine... Should this function even be allowed on the JGFF side anymore?? I don't know how this is supposed to work in the new system... If it's not allowed, the merge function should be eliminated -- or documentation/warnings added to it. (Also see #43).

-------------------------------------------------------------------
26 - Better duplicate researcher detection.

Many people register more than once in the JGFF... because they forgot that they registered before, or because they forgot their password so they decide to register again (despite all the admonitions against this that the registration pages contain), or because their email address changed, as well as other reasons.

The breaking of the Researcher Address info into more specific fields (see #11 above) will greatly facilitate this detection.

As of May 2001, there were 2,400 people in the JGFF with the same email address as another registrant...
that's over 5% of all JGFF registrants... and there are plenty more duplicate entries -- for people who have changed their email address, or have added a new account with an email address, etc., etc. We should NOT allow a duplicate entry with the same email address, or with the same name and address, etc.

The implementation has two parts:

1) Clean up of the existing duplicate researchers. Mostly a manual process. First find the dupes, by sorting the Researcher database by various fields (by email address, by surname, by city/country of residence, etc.). Then merge the user records of the duplicate users, keeping the record for the more-recently modified address... perhaps after some consultation with the users themselves, for confirmation.

2) Prevent future duplicate users from re-registering. This is a programmatic job; a code enhancement to the JGFF user registration process. For all new registrations (and updates), check the user input against the current addressee database for duplicate email addresses, duplicate GivenName/Surname, duplicate City/Country of residence, etc. Give the re-registering duplicate user more/better feedback, telling them the potentially duplicate JGFF Researcher Code #, and display the publicly accessible portions of that JGFF user record.

We should NOT permit two JGFF Researcher entries with the same email address. If someone is serving as an email proxy for a third party, then the third party's surname/town data should all be under the same JGFF Researcher Code as the email proxy.

*** Michael implemented item 2 (preventing duplicate email addresses during new user registration) on 11/20/2003. Dupe email checks during user updates not done yet.

-------------------------------------------------------------------
27 - Town name validation for places outside Eastern Europe.
Currently, we only validate place names in Central and Eastern European countries -- because that has historically been a problematic area (different names and variant spellings for the same town, etc.) These currently "restricted" countries are specified via a "Y" in the "WARN" column of the COUNTRY2.DBF file.

But users can enter *any* town name they want for USA, Canada, Western Europe, Asia, Africa, Latin America, Israel, etc. -- thus we get a lot of garbage bogus entries in the database. We can probably find some off-the-shelf or public domain database (USBGN?) or list of geographical names for most countries, to validate these entries against.

Another, easier idea would be to impose the same restrictions for ALL countries that we currently have in place for Eastern Europe: No new town entries that aren't already in the database. Manual intervention from the JGFF Help Desk is needed to enter a new town in one of these restricted countries, via < http://www.jewishgen.org/jgff/TownQ.asp >. However, this might be too restrictive... perhaps we could relax that rule for particular countries (where we might expect to have many "new" towns), such as the USA/Canada... or slowly add one country at a time to the "validated" list, as we clean up the existing entries for that country (see #27a, below).

The sooner that we add validation, the better. This will result in less garbage data being added to the JGFF.

Another idea: For Eastern Europe, we could extend our current validation scheme to be "A town name is valid only if it is already in the JGFF database, OR if it is a primary native name in 'Where Once We Walked'." We already do the first clause; we could automate the second clause. Implementing that would result in one less case of Help Desk intervention.

Then there's the issue of cleaning up / validating all of our existing bad entries... things like the fourteen ways people have spelled "Philidelfia", etc. (see Item #27a).
This clean-up will probably have to be a manual process... (but see item #35 re utilities that would help.)

Also, we need a way of consolidating variant spellings: "St. Louis, MO", "St Louis, Mo.", "Saint Louis Missouri"; "Fort Lauderdale", "Ft. Lauderdale FL", "Ft Lauderdale, Fla.", "Fortlauderdale, Florida"; "St Petersburg, Russia"; etc. We need to develop a standard regarding the spelling of town names which are often abbreviated (e.g. "St.", "Ft." prefixes) -- use the abbreviation without the period. We should provide auto-replace synonyms for these other names (See Item #56, re Town Synonym Table).

Also see Item #3 above, re USA/Canada states/provinces.

5/30/2001, 2/11/2002.

-------------------------------------------------------------------
27a - Town name cleanup for countries outside Eastern Europe.

In conjunction with town name validation (see #27, above), we need to clean up the existing town data for non-East-European countries, so that when a country becomes a "restrictive" country (i.e. no entry can be added unless the town is already there), the data remains clean.

I propose finding a few "region experts" for each region (e.g. Latin America, Caribbean, Northern Africa, Southern Africa, Middle East, Canada, UK, France, Italy, Scandinavia, etc.) who could easily validate the existing JGFF data. We would provide the "region experts" with a spreadsheet of the existing data (similar in concept to the system at < http://www.jewishgen.org/wconnect/wc.isa?jg~jgsys~jgffclean2~xxx >), and they would fill in some columns, indicating corrections.

We would provide them with a spreadsheet containing columns:
- Country Name
- Town Name
- # of current JGFF entries for that town

The "region experts" would return the spreadsheets, with additional columns filled in:
- Status Flag: OK | Change | Unknown
- New Country
- New Town
- Add Synonym: Y | N

Instructions for "region experts":

* If the entry is OK as it is, set the "Status Flag" to "OK".
* If the entry is an obvious typo/mistake, set the "Status Flag" to "Change", and fill in the "New Country" and "New Town" fields with the data that this bad entry should be changed to. If you feel that this is a common synonym for the place name, which other users are likely to use, set "Y" in the "Add Synonym" field.

* If the entry is unknown/ambiguous/bizarre, set the "Status Flag" to "Unknown".

We also need to make some decisions as to what our policy is regarding "native" names for non-East-European countries. Following the USBGN standard might not make sense for some regions, e.g. should "Jerusalem" be "Yerushalayim"? Should "Brussels" be "Bruxelles"? Should "Aleppo" be "Haleb"? I think that we need to have some exceptions for well-known world cities, use their English forms, and compensate using synonyms.

These "region experts" could also be consulted on an ongoing basis, by the JGFF Help Desk, when new entries for these countries come in.

9/25/2003

-------------------------------------------------------------------
28 - Security.

Ensure that a FORM ACTION request to search the JGFF can come *only* from a page on a JewishGen.org host (www.jewishgen.org, www.shtetlinks.jewishgen.org, data.jewishgen.org, etc.). This security measure should apply to *all* JewishGen databases, not just the JGFF.

This will ensure that no other site can "wrap" JewishGen databases, making it appear that they own JewishGen data. Several 'rip-off' genealogy web companies have sprouted, wrapping public databases and charging customers for access. (These include FamilyDiscovery.com, Genseekers.com, and Genealogy-Express.com).

This security mechanism will also eliminate potential data mining by third parties, and in general gives us much better control regarding data access.

7/2001.

Michael did a partial implementation 10/2001... but we were unable to accurately detect all of the cases.
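A host check of this general kind could be sketched as follows. This is a hypothetical Python illustration, not the code Michael wrote: the function name is invented, and accepting a blank value is a choice made in this sketch because the header is supplied by the browser and can legitimately be absent.

```python
from urllib.parse import urlsplit

# Requests are accepted from this domain and any of its subdomains.
ALLOWED_DOMAIN = "jewishgen.org"


def referer_allowed(referer):
    """Accept a blank Referer, or one whose host is on the allowed domain."""
    if not referer:
        # Some legitimate visitors send no Referer at all, so let them through.
        return True
    host = (urlsplit(referer).hostname or "").lower()
    # Match the domain itself or a true subdomain, not a look-alike suffix.
    return host == ALLOWED_DOMAIN or host.endswith("." + ALLOWED_DOMAIN)
```

Comparing the parsed host name, rather than searching the whole URL for "jewishgen.org", avoids accepting an off-site URL that merely contains that string in its path or as a prefix of another domain.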
We don't fully understand the HTTP_REFERER environment variable -- it appears that it is sent by the browser, which makes it unreliable. Sometimes we are seeing a blank HTTP_REFERER value in cases of valid access (and these aren't bookmarked URLs). So the partial implementation is that only blank or jewishgen.org domains are allowed... Unfortunately, this does leave the door open for some off-site access.

I posted our quandary to a webserver newsgroup, and one respondent recommended "a session cookie as the appropriate security tool. Make sure you have a redirect page for any user who has cookies disabled to explain how you need them to accept session cookies."

Also see item #67, re use of an intermediate CGI script for security.

-------------------------------------------------------------------
29 - Integration into the "All Country" Databases.

When someone does a search of a JewishGen "All Country" database (e.g. "All Latvia", "All Belarus", etc.) for a surname or town, the matching surname/town entries for that country in the JGFF should also be returned, as a separate line-item in the All Country database's "Step One" results display.

7/2001.

*** Michael implemented this (test mode) on 1/28/2003. Not yet integrated into All Lithuania.

-------------------------------------------------------------------
30 - "Another Search" link incorrect.

This problem, which we have not been able to reproduce ourselves, has been reported by several users. It might be related to a particular browser or operating system or sequence of events or some combination thereof... more investigation is needed.

Here is the scenario: After the results of a JGFF search (with the URL being < http://www.jewishgen.org/wconnect/wc.isa?jg~jgsys~jgff~C >), when clicking on the "Another Search" link at the bottom of the display, the resulting URL is: < http://www.jewishgen.org/jgff/jgffweb.htm%23form#form >, which is a bad link resulting in a 404 "Page Not Found" error.
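One possible repair for a malformed return link like the one shown above, as a hedged sketch: decode an escaped "%23" back to "#", then keep only a single fragment. The function name is hypothetical, and this Python version only illustrates the string handling; it says nothing about where the bad URL originates.

```python
def clean_return_url(url):
    """Decode an escaped '#' and keep at most one fragment."""
    # "%23" is the URL-escaped form of the pound sign.
    decoded = url.replace("%23", "#")
    # Split into the page address and everything after the first "#".
    base, _, fragment = decoded.partition("#")
    # Keep only the first fragment if several were appended.
    first_fragment = fragment.split("#", 1)[0]
    return base + "#" + first_fragment if first_fragment else base
```

Because the page address itself is left untouched, this approach would still return the user to whichever search page they started from.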
It is unknown what might be causing this bad link. The "%23" is the hexadecimal representation of the ASCII value of the pound sign ("#"). One solution might be to explicitly check for the "%23" in the previous URL, and replace it with the pound sign character.

Another variant that we occasionally see is a URL in the form: < http://www.jewishgen.org/jgff/jgffweb.htm#form#form > or < http://www.jewishgen.org/jgff/jgffweb.htm#form#form#form >, where the "#form" suffix is repeated multiple times. The solution might seem to be to simply hard-code the value of the "Another Search" link... but I don't think that we can do that, because a JGFF search can originate on other pages, such as < http://www.jewishgen.org/jgff/kiosk.htm >, and we'd want to return the user to their original location. So a solution to this scenario might be to check for multiple trailing "#form" suffixes, and remove them when creating the "Another Search" URL.

Another variant that we sometimes see is the URL: < http://www.jewishgen.org/wconnect/wc.isa?jg~jgsys~jgff~C#form >. This occurs when a search is done while the JGFF is in "maintenance mode"; or if a search is initiated from a web form on a non-JewishGen site; or if the user's browser or security system isn't sending the HTTP_REFERER variable.

7/2001, 1/2003

*** Michael addressed some of these latter issues 1/30/2003. Needs to be thoroughly regression tested.

-------------------------------------------------------------------
31 - Documentation for JGFF FAQ...

Some items have been overlooked, or could be made more explicit:

* How to delete a JGFF entry.
* How to delete a JGFF user. (See Item #53).
* The JGFF LostNFound service, for bad email addresses. DONE.
* New Privacy options... How to contact researchers who opted to display only their researcher number. DONE 5/2002.
* New D-M Soundex option -- use of bracket [] characters. (See #2 above). SearchTypes.html InfoFile.
* JGFFAlert system. DONE 5/2002.
7/2001, 3/2002, 5/2002

-------------------------------------------------------------------
32 - Create a way to temporarily "hide" a researcher.

Sometimes we might want to temporarily hide a suspicious user and their data from public view. We have this capability in the FTJP, but not in the JGFF. Currently the only way that we can do this is to DELETE the user entirely... and then re-create them manually... under a different JGFF Researcher Code. This is too cumbersome.

8/2001.

-------------------------------------------------------------------
33 - Provide an "heir" for deceased researchers.

When JewishGen is notified that a JGFF researcher has died, an administrator marks that researcher as "Researcher Deceased" in the CURE database (Status Note field (ustatres) = "RD"). (Previously, we also optionally provided the month/year of death, which I think that we should re-instate, in the CURE Notes field).

We have no way of verifying this information.... JGFF users don't notify us when they die '8-)... We rely upon others to report this information to us, and we take them at their word.

Our policy has been to leave a deceased researcher's town/surname information in the JGFF database, rather than delete it, for the potential benefit of future researchers -- who might be interested to know that at some time, there was some relative interested in this particular town/surname.

However, for SOME deceased researchers, there is a genealogical "heir"... someone who inherited or picked up their genealogical research. We should develop some way to indicate this in the JGFF database.

There is an equivalent issue in the FTJP... but there is also an additional issue there -- the "heir" should also inherit the rights to the GEDCOM file -- to update it, etc.

To enable this, we should allow researchers to specify an "Alternate Contact" person -- their Name, snail-mail Address, Telephone number, Email address, etc.
This would need to be stored somewhere in the CURE database, and we'd need to give the researcher the ability to periodically review/update the Alternate Contact's information (just as someone can update their will periodically), to ensure that the info is current.

This "Alternate Contact" concept might apply for non-deceased researchers as well: It would be another person who would have the ability to access/update this researcher's information.

8/2001, 11/2003.

-------------------------------------------------------------------
34 - A tutorial.

Provide some sort of guided interactive tutorial for first time users. Something containing very simple basic content on the JGFF -- presented in the form of a video, or Macromedia Flash presentation, etc. Could use a professional educator to do this.

-------------------------------------------------------------------
35 - Administrative utility to modify/query town names globally.

Provide a little utility for JGFF administrators, to use for the mass-conversion of all instances of one town name in the database. For example, to change all instances of "Ungvar" in the database to "Uzhgorod"; or all instances of "Kishinev" to "Chisinau", etc. -- and to then UN-validate the removed name.

Right now, we have to do all of the town name changes manually, researcher by researcher, surname by surname, town by town, for all researchers who had listed the old town name (mostly during the pre-validation 1998-1999 period).

There should be four required inputs: Source town and country, and Destination town and country.

This utility should take precautions, so that no irreversible damage could be done easily... perhaps have an intermediate "Are you sure" page, listing the number of entries that *would* be affected by the requested change, before a "commit" button is pressed.

The utility should ensure that if a user already has an entry with the destination name (i.e.
had entries for the same surname with both the old and new names), a duplicate entry is NOT added. (See Item #58).

This utility should be added to the JGFF maintenance page at < http://www.jewishgen.org/jgff/jgffmain.htm >.

Of course, this is really just an interim step, which can be used for individual towns, before the mass-conversion of Soviet-era Russian-language names to the new native Ukrainian, Belarussian and Moldovan names -- which we need to do once the new "WOWW2" comes out (in late November 2002), and we can roll out the revised ShtetlSeeker (see items #1, #1a, #1b).

After the USBGN Feature Codes are added to the JGFF for the initial ShtetlMaster implementation (see item #1b, #7), then this utility should be expanded, to be keyed off of the USBGN Feature Code number. Of course, if the suggestion in #1b is implemented -- to move all town names into a separate TOWNS.DBF file -- this entire design is moot, and we'll need a completely different type of utility to manage the TOWNS.DBF table.

Also see item #56, Town Synonym Management Utility.

Query utility:
~~~~~~~~~~~~~
We could also use an administrative function to query the database without restrictions, e.g. bypassing the restrictions on wildcard characters, to be able to see ALL entries for a particular country (using "*"), etc. Filters by date, Researcher Code #, etc., should be allowed. The ability to exclude items (e.g. "NOT") would be a plus. Raw SQL-like queries are fine with me... It should have the ability to download the results to a spreadsheet... to allow an administrator to see/sort all data, to find and enable corrections.

-------------------------------------------------------------------
36 - Remove visible passwords from URLs.

When a user modifies their JGFF entries, the user's password visibly appears as part of the URL of the acknowledgement display page, i.e.
http://www.jewishgen.org/wconnect/wc.isa?jg~jgsys~jgffupdt~qwerty
and
http://www.jewishgen.org/wconnect/wc.isa?jg~jgsys~jgffview~%2012345~qwerty~MODRES~GEDCOM,
where "qwerty" is the user's password.

This would be considered a security violation by some, and should be removed. The password should be passed to this page in some less visible manner -- perhaps a hidden form variable, or better yet, via encryption.

10/16/2001

-------------------------------------------------------------------
37 - Suppress some warning messages from data entry ack screens.

The message generated by the FORCE flag, "This was checked against other JGFF data and would have been REJECTED, but a JGFF supervisor has FORCED the entry" should be suppressed from display on the data entry acknowledgement screen.

This message is no longer meaningful, and only causes user confusion when viewing the data entry acknowledgement page.

The implementation of ShtetlMaster linkage (see items #1b, #7, #50) will also make these messages obsolete. An entry which has been "forced" (i.e. does not have a corresponding valid USBGN Feature Code number) will be indicated via the lack of a hyperlink to the corresponding "ShtetlMaster Locality Page".

10/17/2001

-------------------------------------------------------------------
38 - Town sponsorship banners.

In addition to our current rotating JewishGen-erosity $1K banners displayed at the top of each results page, we could also implement a "per town" sponsorship. Initial request came from Howard Margol, for Pushelot, Lithuania.

Partial implementation (samples for Ostroleka, Poland; and Pushelotas, Lithuania) was completed 10/30/2001.

To do:
* Design a more stylized display of the sponsor's name (use a box with a gold background, like the Yizkor Book plaques?).
* Develop and document a pricing model (annual, lifetime...?)
* Implement/document an online maintenance system.
-------------------------------------------------------------------
39 - Better error feedback for Address Entry/Edit screen.

In the re-designed JGFF Addressee info screen, most of the fields are now mandatory. However, if you leave one of them blank (or fail to select an item from a pulldown), you receive the following error message:

   "You MUST enter your full address before you can be added to the JGFF. Please press the back arrow on your browser and try again!."

This message has confused many people, because it is too vague. Many users have written to support, saying that they are unable to complete the addressee form... and we sometimes can't adequately help them, because we can't tell WHICH field they missed or completed incorrectly.

[This often happens when a user tries to update an address in the "old" address format -- before we added the separate "City" and "Country" fields (in Sept 2001, see Item #11). The address LOOKS correct to them, but it isn't accepted, because the data is in the wrong fields -- they need to move their city from the old "Address" field into the new "City" field, they need to select their country from the pulldown, etc., which are all new fields. See instructions at < http://www.jewishgen.org/jgff/jgffupdate.html >.]

Solution:
~~~~~~~~
The error message above should be more precise and less generic. It needs to specify WHICH field is blank or missing, to give better feedback to the user. Add a middle line to the above error message, for example: "Country not selected", or "Street Address field is incomplete", or "City field is blank", etc.

Eventually...
~~~~~~~~~~
We could have some more sophisticated form validation feedback mechanism, using JavaScript, as is done on other sites, e.g. color/highlight/flash the bad field(s), etc. See related Item #52, re validation.

11/2001.
-------------------------------------------------------------------
40 - Re-indexing.

The JGFF database needs to be periodically "re-indexed".
This is a matter of internal database inconsistency -- the symptom is that some names found in the search results will NOT match the input criteria; and conversely some names in the database that should match the search criteria will NOT appear in the search results.

This inconsistency is occurring with more and more frequency lately -- perhaps every other day. The cause of this is currently unknown (Michael suspects that this could occur when two users try to simultaneously update the database); but the solution is known -- a process needs to be run which will "re-index" the JGFF data, putting the index back in sync with the data. This re-indexing process requires that the JGFF be off-line for about 5-10 minutes.

This re-indexing is currently a manual process, which can be done only by Michael, requiring direct system access via pcANYWHERE or Terminal Server. What we need to do is to automate this process, so that:

1) Re-indexing can be performed by any JGFF administrator, at will, via a web form in our administrative area (password-protected).

2) Re-indexing can be performed automatically, via a regularly scheduled batch task -- say every night at 3am CST (our slowest time).

While the re-indexing process is running, access to the JGFF is limited. A user attempting to use the JGFF during the re-indexing process should receive a warning message about the JGFF being temporarily unavailable due to necessary regular maintenance, and to try again in a few minutes.

But ideally... we should find and fix the root cause of the problem. A properly functioning transactional database should not ever require re-indexing.

11/2001.
-------------------------------------------------------------------
41 - Testimonials.

It would be nice to have some JGFF testimonials somewhere on the site, for marketing/publicity/PR purposes. This would require someone with writing/editing skills to solicit/collect testimonials from JGFF users; and write them up attractively...
as part of the JewishGen "Press Center".

We've received many testimonials over the years, in various forms, e.g. "You helped me find my long-lost cousin... Without JewishGen, I never would have known that...", etc. Some of these messages have come in to the "JGFFHelp" list, others have been posted to Support or the Guest Book or the various mailing lists...

11/2001.
-------------------------------------------------------------------
42 - Automated Password Retrieval

Create an automated system that will allow researchers to retrieve their JGFF password automatically, without the manual labor that is currently involved in this process. There are similar systems in place elsewhere on the net.

Our current "I forgot my password" system is manual. The Password Form at < http://www.jewishgen.org/jgff/password.html > simply generates an email message, which is retrieved and processed manually by a volunteer at our "Password Desk". A Password Desk volunteer then manually looks up the user's password, and emails it back to the requestor. There is often a large delay in this process -- it could be a day or more before the Password Desk volunteer views the email message, does the research, and responds to the requestor.

The Password Form should be modified, so that a *program* receives the request first, and if the JGFF Researcher Code and email address match, it automatically emails the password back to the requestor -- no manual intervention involved.

So if the user can remember their JGFF Researcher Code but not their password, and the message comes from the same email address that is listed for that Researcher Code number in the JGFF researcher database, the program would send them the password via email.
If the Researcher Code and Email Address do NOT match, the program would then pass the message on to the Password Desk, for manual processing, as is done currently.

We can't look up the Researcher Code or Password based solely on an email address, since we have thousands of duplicate email addresses in the database (see #26 above). However, I propose that we eliminate duplicate email addresses in the Researcher database -- once that becomes the case, we can send a response based upon email address only, which will be a great benefit for those who forget their Researcher Code number.

Response screen: After the user presses the "Send Request" button on the Password Form, the subsequent acknowledgement page should contain one of the following messages:

* If the input Researcher Code Number and Email Address match in the Researcher database, the password should be emailed immediately, so the message should read: "An email message has been sent to containing your password." (The same should be done if they leave the Researcher Code # blank, and the Email Address is unique in the Researcher database).

* If the Email Address is not in the Researcher database (or is not unique in the database), the request will have to be handled manually. In this case, the acknowledgement page's text should be something like: "The email address has not been registered in our database. Your information has been submitted to the Password Desk. A volunteer at the Password Desk will research the matter and respond to you."

This assumes that the process is fast enough to locate the email address in the Researcher database in real time, i.e. while the user is waiting for the acknowledgement screen... this might require hardware/software upgrades, or adding an index on the email field (if there isn't one already).
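
The matching rules above can be sketched as follows. This is only an illustration in Python, not the JGFF's actual code: the Researcher record, the function name and the two action constants are hypothetical, and comparing email addresses case-insensitively is my assumption. The function only decides what to do; it never returns a password for display, since passwords should go out by email only.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class Researcher:
    """Hypothetical stand-in for a row in the Researcher database."""
    code: int        # JGFF Researcher Code
    email: str
    password: str


# Hypothetical action names for the two outcomes described in the text.
EMAIL_PASSWORD = "email_password"
FORWARD_TO_DESK = "forward_to_password_desk"


def route_password_request(
    researchers: Iterable[Researcher],
    email: str,
    code: Optional[int] = None,
) -> Tuple[str, Optional[Researcher]]:
    """Decide whether a request can be answered automatically.

    Returns (action, researcher). The researcher is set only when the
    password may be emailed to the address already on file.
    """
    wanted = email.strip().lower()  # assumption: case-insensitive match
    matches = [r for r in researchers if r.email.strip().lower() == wanted]

    if code is not None:
        # Code supplied: it must belong to a record with this email address.
        for r in matches:
            if r.code == code:
                return EMAIL_PASSWORD, r
        return FORWARD_TO_DESK, None

    # Code left blank: answer only if the email address is unique.
    if len(matches) == 1:
        return EMAIL_PASSWORD, matches[0]
    return FORWARD_TO_DESK, None
```

Anything that does not match cleanly falls through to the Password Desk, so the program never guesses.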
Automating the Password Form would save the volunteer manpower of looking these up, reducing these requests to only those whose email is out of date, or those who have forgotten everything. It would also speed up the response time for the users -- it would become virtually instantaneous, instead of taking hours or days for a response.

For those researchers whose email has changed or is no longer valid, this system obviously won't work... Those would still have to be handled manually. For security reasons, we do not want the program to guess or make assumptions about users and send out passwords... we do not want to send out passwords to the wrong people. Passwords should NOT be printed to the screen -- they should only be sent via email, to an address already in the Researcher database.

11/2001, 1/2003.
-------------------------------------------------------------------
43 - Administrative notification of researcher address changes.

Because JewishGen's accounting/fundraising data systems are not directly tied to the JGFF database (they use QuickBooks or now GoldMine), the JewishGen administrative staff needs to be manually notified whenever a JGFF addressee record changes, so that they can update the corresponding data in their system.

They should be notified via email:
- If someone changes their mailing address (a user function).
- If someone changes their email address (user function).
- If a JGFF researcher record is deleted (admin function).
- If two JGFF researcher records are merged (admin function).

The email address(es) of the notification recipients (JewishGen administrative personnel) should be configurable via a Lyris mailing list.

This email notification will help us keep the JewishGen administrative systems in sync with all the other systems.

11/28/2001

*** Partial implementation of integration with GoldMine, 7/2002, by "Professional Edge" software contractors.
-------------------------------------------------------------------
44 - Annual Renewal Notices.

In order to keep the researchers' address and contact information current, perhaps we should send out annual "Renewal Notices" to all JGFF researchers. The notice would contain the researcher's contact information (postal address, email address), as well as their complete list of JGFF ancestral surnames and towns. The notice would encourage them to validate their listings and update them if necessary... and we could use this opportunity for other "marketing" reminders as well. This might be a more effective "personalized" way of communicating with the JGFF's users.

This will also help us to be more pro-active in identifying and weeding out bogus users, duplicate entries, bad email addresses, etc.

The renewal messages could be done all at once on a particular date each year, or in batches throughout the year, perhaps on the anniversary of the user's registration, or in monthly batches, etc., if that's easier to handle...

Support needs to be prepared to deal with the influx of password requests / updates / questions / complaints that will be generated as a result of these Renewal Notices.

As an additional enhancement, which would really ensure that we always had up-to-date info... perhaps we should REQUIRE that each user CONFIRMS their contact information annually -- via an automated web form. The "Renewal Notice" email message would contain the custom URL of a dynamically-generated web form for that specific user (the user would have to input their JGID and password to continue). If the user does not confirm within say 2-3 weeks (by clicking some simple "This data is OK" button on the custom webform), then another Renewal Reminder email goes out, etc. If they don't respond at all after some period of time, then we put some sort of notation on that user's researcher record (e.g.
"No response since 2/22/2002" or "Contact information last confirmed 2/22/2002"), which would be displayed (in a small font) on that user's record in the JGFF search results. Note that this new "Date Last Confirmed" date is different than the Researcher record's "Date Last Updated" field.

The LostNFound team could manually follow-up on those users who do not respond at all after some period of time.

2/10/2002, 2/22/2002.
-------------------------------------------------------------------
45 - Researchers with no data - better "Modify" interface.

There are three options on the "Modify Entry" page at < http://www.jewishgen.org/jgff/jgffview.htm >:
* Modify Researcher Information (Your Name/Address/Password/Display)
* Modify Surname/Town Information (Surnames/Towns/Countries)
* Add Additional Surname/Town Information (Surnames/Towns/Countries)

If a researcher hasn't entered any surnames/towns, then selecting the second option yields a long page of instructions, with the notice at the bottom of the page: "You do not appear to have ANY surname/town records in the JGFF. Please return to the MODIFY page and choose the THIRD option (Add Additional Surname/Town Information) to add your data."

My suggestion is: If the researcher has no data, then *don't* print all of the instructions -- just print the notice. This will make the page 90% smaller, and make the situation much more obvious to the user.

Even better: Instead of printing the static instructional text "Please return to the MODIFY page and choose the third option...", create a dynamic button, which will go directly to the "Add Additional Surname/Town Information" page for that user. We should already have all of the required information (JGFF Researcher Code and Password) in hidden FORM fields. This would be a smoother user experience.

Even better: Go directly to the "Add Additional Surname/Town Information" page for that user. Thus the last two options would do exactly the same thing, for users with no data.
2/26/2002, 9/9/2002.

*** DONE on 5/5/04, in the CURE JGFF.
-------------------------------------------------------------------
46 - Sort order -- Display results in a user-selected order.

Currently, the JGFF's search results are always displayed in order by JGFF Researcher Code -- which is not a particularly useful arrangement. Ordering the results by "Surname", "Town", "Country" or "Date Last Updated" would perhaps be more valuable arrangements.

We could allow the user to select how the results should be ordered for each search. The sort order could be selected either via a drop-down menu on the query form, and/or also via clickable column headings on the Search Results page -- clicking on the column heading would re-sort the results in the order of that column.

One potential issue with using a different sort order is that there could be multiple matches per researcher. Currently, all of the matches belonging to a particular researcher appear together -- so the researcher's Contact Information needs to appear only once. If we use a different sort order, then each of the matching entries for one researcher could be spread out in separate places in the list -- so the researcher's Contact Information would need to be repeated multiple times on the results page, once for every time one of their entries appears.

Repeating the Contact Information would certainly be the easiest solution, and there's inherently no harm in that, other than that the output is larger because of the redundancy. We might be able to live with that... especially if at the same time Item #5 was implemented (see above "Display search results in batches").

Or we could redesign the output, back to something like the olde ancient version, where the matches were displayed and only showed JGFF researcher code, and then the second part of the results was a listing of researchers ordered by code, with their associated Contact Information.
But this would be an unattractive display, and would probably confuse many users -- so I would opt for the first solution.

3/4/2002, 1/25/2003.
-------------------------------------------------------------------
47 - Multiple towns with the same name.

Need to determine a way to deal with two or more towns in the same country with identical modern native names. We need to be able to differentiate between them. Currently, such listings are ambiguous.

We could use the WOWW approach, and add a parenthetical phrase e.g. "(near Krakow)" or some other qualifier to the *smaller* of the localities, to differentiate it from the larger town of the same name. The larger "Jewish" town should appear *without* a parenthetical qualifier. The use of parentheses would allow the search engine to still work properly -- since it ignores all text within parentheses.

Behind the scenes, each should be assigned its appropriate USBGN Feature Code, so that when the locality name is clicked on (see item #7 above), more details about that locality are presented.

3/5/2002.
-------------------------------------------------------------------
48 - "Dead" towns.

There are a few rare actual instances of "the town no longer exists -- the town was wiped off the map" -- most often due to post-war urbanization, not WWII. These localities are NOT in the USBGN/ShtetlSeeker (our "locality Bible"), and thus they have no "modern native name" or USBGN Feature Code. Hence our problem.

We should probably allow these entries, on a very limited case-by-case basis, each supervised by a JGFF editor. Each entry should be accompanied by some note indicating that this town is not in the USBGN, and approximately where it was located. In the column of "USBGN Feature Code", some special indicator should be used, to indicate that this locality is not in the USBGN database.

3/5/2002.
-------------------------------------------------------------------
49 - Prevent user entries with no Surname/Town data.
A JGFF user entry without any Surname/Town entries is a useless entry, and should be avoided -- it clutters up the researcher database, and these are also more likely to be bogus entries.

The JGFF's current multi-step registration process does make this requirement difficult to enforce... because we never know when they're "finished". (Entering their Surnames/Towns is the third and final step of registration).

One method of enforcement might be that each day, as a batch job, we simply invalidate all new entries more than X hours old, which don't have any Surname/Town entries -- or send them a canned email message, stating that they now have X days to enter some Town/Surname data.

A more sophisticated method would be to have a confirmation email sent to them when they completed the third step -- and that email would contain a URL that they'd have to click on to continue with a new fourth step, which would be required to complete the registration process. Some online merchants have a setup like this (Amazon? Ebay?). This facility would be more complex for us to create, but has the added value of ensuring that each user provides a truly valid email address. A registration would not be considered complete and valid until this fourth confirmation step was completed. (See Item #24, on email address validation).

Notes:
* Note that it might still be possible for a user to have no Surname/Town entries, if they delete all of their entries at a later date. To find these users, we would need to scan through the entire database periodically.
* If we will be using the JGFF User Database as our "master" user database for all JewishGen projects, we might have to allow some entries with no Surnames/Towns, such as for contributors of data to other projects (JOWBR, FTJP, etc.), volunteers, financial contributors, etc. There should be some other indicator in the researcher database for this case.
* The current FTJP user interface does not allow a user to search the FTJP unless they have at least one Surname/Town entry in the JGFF.

7/1/2002

*** Update 2004: With the advent of CURE, it is now possible and permissible for registered CURE users to have no JGFF entries, so this item is moot and closed.
-------------------------------------------------------------------
50 - Town name data entry - More assistance for users.

Users often have difficulty determining the correct modern native name for a town, as required by the JGFF rules. For places in Eastern Europe, we currently reject placenames that aren't already in our database.

When a place name in Eastern Europe is rejected, the user is currently presented with a small set of choices on the data confirmation/rejection page -- these are possible town names within the region which are Soundex matches for the user's input name. For each possible locality, we present three fields:
- Town Name
- Country Name
- Current Number of JGFF Entries for that town.

We can improve upon this display, and give the users more information, allowing them to make a more informed decision. In addition to the three data items mentioned above, once we have the town data linked to USBGN Feature Codes (see Items #1, #1a, #1b above), we could also provide the locality's latitude/longitude, alternate names, links to maps, etc. -- and/or ideally -- a link to the locality's "ShtetlMaster Locality Page" (see Item #7, above) in a small pop-up window.

We could also expand the list of potential soundex matches to include town synonyms.

Eliminate "see" synonyms:
~~~~~~~~~~~~~~~~~~~~~~~~
In the list of potential soundex matches, we should NOT present the dozen "see" references, such as "Warsaw", "Danzig" -- which we include in these lists currently. These should be filtered out (or eliminated from the DATA.DBF file entirely), because they are not valid entry selections. (See Item #56).

7/10/2002, 9/18/2002.
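
A rough sketch of the "see" filtering, in Python for illustration only (the real system is not written in Python). It assumes that the "see" references can be recognized by their Researcher Code of "1", as noted in Item #56, and that the soundex matching has already been done elsewhere; the function name and the row layout are hypothetical.

```python
from collections import Counter

# Assumption (from Item #56): the "see" references are the DATA.DBF rows
# that carry a Researcher Code of 1.
SEE_REFERENCE_CODE = 1


def candidate_choices(data_rows, matched_towns):
    """Build the list of choices shown on the rejection page.

    data_rows     -- iterable of (researcher_code, town, country) tuples
    matched_towns -- set of (town, country) pairs that the soundex step
                     (not shown here) considers possible matches
    Returns sorted (town, country, number_of_entries) tuples, with the
    "see" references left out of both the list and the counts.
    """
    counts = Counter()
    for code, town, country in data_rows:
        if code == SEE_REFERENCE_CODE:
            continue  # not a valid entry selection
        if (town, country) in matched_towns:
            counts[(town, country)] += 1
    return sorted((town, country, n) for (town, country), n in counts.items())
```

A name that exists only as a "see" reference ends up with no entries and so never appears among the choices.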
-------------------------------------------------------------------
51 - Integration with the "JewishGen Family Links Database".

The "JewishGen Family Links Database" is a small database of family-based web pages, at < http://www.jewishgen.org/family >. It is a totally separate system, not integrated with the JGFF, CURE, or any other JewishGen system.

The Family Links Database facility is under-utilized -- there are about 1,000 records. Its integration with the JGFF would greatly promote its use... as well as enhance the JGFF itself, because researchers would then have instant access to more information about a particular family located during a search.

The Family Links Database associates a URL with a surname. I propose that when we integrate this with the JGFF, we associate the URL with a *researcher*, rather than with an individual surname, since many of the URLs are for web pages concerning multiple surnames (multiple families) created by the same researcher. Associating the URL with a researcher will also be much easier to implement, and has a lesser impact on our data structures: We add an optional URL field to the user's CURE record -- the JGFF's data files are not affected.

The URL would appear in the JGFF's search results display, in the far-right column, along with the researcher's contact information.

Once this integration is achieved, the Family Links Database basically disappears. (Though we'll have a small migration issue).

The Family Links Database also has a facility for *creating* family web pages, for those people who don't already have a family web page. These user-created family web pages are hosted on the JewishGen web server. This part of the Family Links Database should probably remain, in some form. There are about 150 family pages of this type.

Problems/Issues:
- Validation of URLs: Upon data entry, we need to ensure that the URL points to a valid accessible web page.
There also needs to be some on-going monitoring of URL validity, probably via Elsop LinkScan.
- Monitoring of content: This is more difficult and problematic. How do we ensure that the referenced web site does not contain any objectionable material? Do we care? Do we need a legal disclaimer, etc.? This isn't a greater problem than that which already exists in the Family Links Database... we just haven't been monitoring it. It's the same issue which affects off-site ShtetLinks sites, and should be handled similarly.

7/12/2002
-------------------------------------------------------------------
52 - Validation of all user inputs.

In addition to validating email addresses (see item #24) and town names (see item #27), etc., we really need to be validating ALL user input fields up front, to whatever degree possible.

* In the JGFF Researcher Record -- we need to ensure that certain fields are mandatory, e.g. Researcher's Given Name, Surname, Street Address, Town, etc. are all non-blank and contain more than one character. (See Item #39 re giving feedback). These should ALSO be looked over by a human within a few days of entry, and obviously bogus and invalid entries should be deleted. No machine will ever be able to detect purposefully bogus entries -- a human is always needed for the final check.

* In the JGFF Data records -- Surnames and Towns -- we already have some minimal validation for towns in certain countries, but need to expand this to other countries (see items #3, #27). For surnames, we need to prevent the entry of all invalid characters, such as numbers and all punctuation ["?" "," "." "/", "()", etc.] (see item #12). This could be done at both the JavaScript level (client side), as well as in the backend (server side). Note that if we do JavaScript (client side), we STILL need to do it on the back end as well -- because people can disable JavaScript in their browsers, and older browsers don't support it at all.
Implementing this checking in client-side JavaScript would be a good thing, because it gives more instantaneous feedback to the users.

The more validation we do up front, the cleaner the data in the system will be, and the fewer hassles and less maintenance we'll have down the road.

See some sample basic JavaScript routines that I wrote, at < http://www.jewishgen.org/wb/Keyboard.html >, which could be adapted to validate user inputs for valid characters in the Surname and Town data fields.

Some validation could be similar to the new CURE user registration modification screen, where the same page is re-displayed, with the erroneous entries highlighted in RED.

7/19/2002, 11/23/2002, 10/16/2004.
-------------------------------------------------------------------
53 - User deletion.

There is currently no way for a user to completely DELETE their JGFF Researcher Record. They can delete all of their Surname and Town entries; they can modify their Name/Address/Options in the Researcher Record; but they cannot DELETE their Researcher Record. A Researcher Record can only be deleted by a JGFF administrator -- so the user needs to send an email to jgffhelp to request deletion. Only a handful of users have requested this so far.

We should perhaps have an end-user function which allows a user to completely delete their Researcher Record. Of course, we should not encourage this -- this function should be "difficult" to find on the site, but it should be possible. There should be strong warnings and "Are you sure?" prompts given to the user at all appropriate points. Perhaps this deletion form should have a mandatory field asking "Why have you chosen to delete your JGFF Researcher Record?", and/or other simple market research questions.

This function needs to ensure that all of the user's associated material (FTJP entries, etc.) is also deleted or disassociated. This function needs to be tightly integrated with GoldMine, to ensure that the parallel function occurs on both sides.
8/13/2002.

*** Update 2004: This item now moves to CURE, not JGFF's arena.
-------------------------------------------------------------------
54 - User "NOTE!" and Auto-alert settings fields.

These three fields in the user's Researcher Information record are currently editable as text input fields in the web interface, when using the administrative password. I believe that these should now become *read-only* fields on the web form, since they should now be set only via GoldMine.

If this is not the case, the editing rules should be clarified and documented, and then these fields should become a pulldown SELECT menu and radio button, respectively -- so that invalid data is not entered into these fields on the web side.

Also see related Item #61.

9/11/2002.
-------------------------------------------------------------------
55 - Country list consistency.

The list of valid JGFF countries is not consistent between the Search form and the Data Entry/Modification form. The Search form, < http://www.jewishgen.org/jgff/jgffweb.htm >, uses the country list in "/jgff/country_list.txt", via SSI (Server Side Includes); while the Data Entry and Modification forms, which are dynamically generated, use a different list of countries.

For example, the following countries are in the correct order in the /jgff/country_list.txt used on the search form, but are out of order in the country list used in the data entry and modification screens:
- Channel Islands
- Netherlands / Netherlands Antilles (swap)
- New Zealand
- South Africa
- Slovenia / Slovakia (swap)

While I believe that the content of the two lists is the same (this should be verified), they each present the country list in a different *order*. The list in the Data Entry screens is not in alphabetical order, and should be. Ideally, both forms should use the same source.

9/11/2002.
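
The verification mentioned above (same content, different order) is easy to script. A minimal sketch in Python, for illustration only: the two function names are hypothetical, and the lists in the usage note are made-up samples based on the swaps listed above, not the actual contents of either country list.

```python
def out_of_order(names):
    """Return each adjacent pair (a, b) where a should sort after b."""
    return [
        (a, b)
        for a, b in zip(names, names[1:])
        if a.casefold() > b.casefold()
    ]


def content_differences(search_list, entry_list):
    """Return (only_in_search, only_in_entry), each sorted alphabetically."""
    search_set, entry_set = set(search_list), set(entry_list)
    return sorted(search_set - entry_set), sorted(entry_set - search_set)
```

For example, out_of_order(["Netherlands Antilles", "Netherlands", "New Zealand"]) reports the single pair ("Netherlands Antilles", "Netherlands"), and content_differences() returns two empty lists when both forms really do carry the same countries.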
-------------------------------------------------------------------
56 - Town Synonym Table - Manager Utility

Need a small administrative interface to manage town synonyms.

Tying the JGFF town names to ShtetlSeeker (see items #1a, #1b, #7, #8) will assist greatly with the management of town synonyms... but we will still need the separate database of synonyms which are NOT in the USBGN data, such as some of the town synonyms that we currently use in the JGFF -- some Yiddish names; synonyms for those countries not covered by the USBGN (e.g. "New York" -> "New York, NY", etc.); common typos; and region/province names.

It would be good to have a simple utility to manage all these synonyms: Search/List, Add, Delete, Modify.

The synonym table should also incorporate the dozen "see" references (Warsaw, Danzig, etc.) currently in the DATA.DBF file with a Researcher Code of "1", and those entries should be removed from the DATA.DBF file (See Item #50).

There should be a check to avoid circular references within the synonym table.

The synonym table should include auto-replacements for:
- English names of major East European cities:
     e.g. "Warsaw" -> "Warszawa"; "Vienna" -> "Wien"
- Completion of state code for major US cities: I can provide a list of the 100 largest US cities...
     e.g. "Chicago" -> "Chicago, IL"
- Commonly misspelled names:
     e.g. "Pittsburg, PA" -> "Pittsburgh, PA"
- Placenames with alternate spellings:
     e.g. "Saint Louis, MO" -> "St Louis, MO"
- Region/Province names:
     e.g. "Podolia" -> "Podolia, (Gubernia)"
          "Courland" -> "Courland, (Gubernia)"
          "Galicia" -> "Galicia, (Region)"
- Abbreviations for all US States and Canadian Provinces:
     e.g. "FL" -> "Florida"

Would also be nice to have a few "global" town synonyms, which are applicable within ANY country, for auto-replace, i.e. "Anytown", "Anywhere", "Any Town", "Any Where", "Unknown", "City Unknown", "Everywhere", "All", etc. all become "Any".
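
The circular-reference check could work along these lines. This is just a sketch in Python under simple assumptions: the synonym table is modeled as a plain dictionary mapping each synonym to its replacement, lookups are exact-match, and both function names are hypothetical.

```python
def resolve(synonyms, name):
    """Follow the synonym chain from name to its final replacement.

    Raises ValueError if the chain loops back on itself.
    """
    seen = [name]
    current = name
    while current in synonyms:
        current = synonyms[current]
        if current in seen:
            chain = " -> ".join(seen + [current])
            raise ValueError("circular synonym reference: " + chain)
        seen.append(current)
    return current


def would_create_cycle(synonyms, old, new):
    """Check, before an Add or Modify, whether old -> new closes a loop."""
    current = new
    visited = set()
    while True:
        if current == old:
            return True
        if current in visited or current not in synonyms:
            return False
        visited.add(current)
        current = synonyms[current]
```

The Add/Modify screen of the utility would call would_create_cycle() and refuse the change when it returns True, so resolve() should never actually hit a loop in a table maintained this way.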
Also see related Item #35, an administrative utility to modify/query town names globally -- which this utility might supplant if Item #1b (TOWNS.DBF table) is implemented.

9/11/2002.
-------------------------------------------------------------------
57 - Alphabetic case inconsistencies.

I notice when perusing through the DATA.DBF file that some town and country names are capitalized inconsistently. Most often, the names are initial-letter uppercase, with the remainder in lowercase -- but some town and country names are in all lowercase (Researcher Codes in the 34000 range), or are in all uppercase, or have some all uppercase words (e.g. "Rio DE Janerio", "Newcastle ON Tyne").

This probably isn't a real problem internally, but it does make sorting a bit more difficult, and this data does display inconsistently to users, and should be cleaned up.

The data should be made consistent -- both by cleaning up the existing data, and by programmatically enforcing the "only initial-letter uppercase" rule for all future input/edited data.

9/20/2002
-------------------------------------------------------------------
58 - Duplicate and blank Town/Surname entries.

In the DATA.DBF file, there are many instances of identical Town/Surname entries for the same user. Some of these dupes might be the result of a user merge (see Item #25). We should run through the entire database, eliminating these entries. These dupes also should be programmatically prevented from recurring (the VFP "candidate index" feature can define a set of fields to be unique in a table structure).

Blank rows -- There are some 11,000 empty (or nearly empty) rows in the DATA.DBF file. Most have only a "Source", "Gotcha" and "Last Change" field -- with nothing in the "Code", "Surname", "Town" or "Country" fields. I assume that these are leftover as the result of some delete/merge operations. They should probably be cleaned up, to compact the database.
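
A one-time cleanup pass over the data could look something like this. It is a sketch in Python for illustration only (the actual table is a VFP DBF): rows are modeled as dictionaries keyed by lowercase field names, the function name is hypothetical, and treating entries that differ only in letter case as duplicates is my assumption, tied to the case inconsistencies in Item #57.

```python
KEY_FIELDS = ("code", "surname", "town", "country")


def clean_rows(rows):
    """Drop blank rows and duplicate Surname/Town entries.

    rows -- list of dicts with at least the KEY_FIELDS keys
    Returns (kept_rows, blank_count, duplicate_count). The first copy of
    each entry is kept; later copies for the same Researcher Code are
    counted as duplicates.
    """
    seen = set()
    kept = []
    blank = dupes = 0
    for row in rows:
        key = tuple(str(row.get(field) or "").strip() for field in KEY_FIELDS)
        if not any(key):
            blank += 1          # nothing in Code, Surname, Town or Country
            continue
        normalized = tuple(part.casefold() for part in key)
        if normalized in seen:
            dupes += 1          # same user, same surname/town/country
            continue
        seen.add(normalized)
        kept.append(row)
    return kept, blank, dupes
```

This only handles the existing data; preventing new duplicates would still need a uniqueness rule enforced in the database itself.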
*** Update 11/2002: About 7,000 duplicate records were cleaned up programmatically. There will be problems implementing the VFP "candidate index" because of the way that Surname/Town records are deleted -- by blanking them out instead of a true delete. This algorithm would need to be changed, in order to use the candidate index.

9/23/2002

-------------------------------------------------------------------

59 - Enhance the statistics report.

Add some features to the internal JGFF statistics report at < http://www.jewishgen.org/jgff/Admin/jgffstats.htm >. Some of these stats are currently extracted and posted in the JGFF-FAQ and other announcements; others are useful internal metrics, or might make interesting additions to report publicly.

* For each country, add the number of *unique* localities.
* Add a report listing the top XX towns.
* Add a report listing the top XX surnames.
* Add a report listing the average number and distribution of surname/town entries per researcher (e.g. how many researchers have only one surname/town entry, how many have two, how many have 3-5... 10-15... 16-32... how many have 32-48... how many have more than 48).
* Distribution of JGFF Researchers' locations: How many reside in each country, US state, etc. (This might be more of a CURE user analysis feature now).
* Eliminate statistics which are no longer meaningful, due to CURE (e.g. "Last Code", "# Researchers").

Perhaps these might be easier to generate as separate reports, due to performance considerations.

9/24/2002, 10/31/2002, 10/17/2004.

-------------------------------------------------------------------

60 - Impose a limit on the number of surname/town entries per user.

We currently do not limit the number of surname/town entries that a user can make. The JGFF-FAQ says that the limit is "99" (see < http://www.jewishgen.org/jgff/jgff-faq.html#q4.3 >), but that limit is not actually enforced in the JGFF code.
A handful of users have taken advantage of this, and have hundreds and hundreds of entries -- obviously far more than just their direct ancestral lines. The JGFF entries should basically be limited to the principal surnames that one is researching; only the direct ancestral lines. But it appears that some users have entered everyone on their family tree.

Typically, the data should consist of the surnames and ancestral towns of your four grandparents, your eight great-grandparents, your 16 great-great-grandparents, your 32 g-g-g-grandparents, etc. That's 64 surnames for 7 generations, 128 surnames for 8 generations. It's doubtful that many Jewish genealogists could complete a pedigree of 8 generations for all lines.

Of course, there are variant spellings of each surname, and multiple towns per surname, important collateral lines, etc. Nonetheless, I feel that there should be some upper limit... perhaps 200. Existing users that exceed the limit can stay.

9/25/2002

-------------------------------------------------------------------

61 - List JGFFAlert status info on Researcher Info page.

We currently list the JGFFAlert status and expiration date fields on the Researcher Info page *only* when an administrator views the page (i.e. entry via the administrative password). This data should be made available to all users. This affects the output of both the "Modify" and "List" pages.

Each user should be able to see their own JGFFAlert status and expiration date -- as read-only fields. In Admin mode, these fields are labelled "Auto Alert setting" and "Auto Alert renewal (dd/mm/ccyy)", respectively, and are editable (see related Item #54). In non-admin mode, the labels should be more user-friendly.

For those users who have not yet signed up for JGFFAlert, this space can be used as a plug/promo for JGFFAlert, with a link to the JGFF-FAQ section describing the JGFFAlert system < http://www.jewishgen.org/jgff/jgff-faq.htm#q3.7 >.
9/26/2002

-------------------------------------------------------------------

62 - When "No matches found", offer suggestions.

Currently, if a user searches the JGFF and there are no matches, on the results page there's the simple message: "Sorry, but there were no matches. Please try another search or wait until more data has been added."

I suspect that some users might be discouraged by this message. Other users might be searching in the wrong way. We can offer some helpful suggestions, which are outlined in the JGFF FAQ Q #3.6: < http://www.jewishgen.org/jgff/jgff-faq.html#q3.6 >.

For instance:
- If the user has entered BOTH a surname and town name and received no matches, then offer the first suggestion (to search for surname or town separately).
- If the user is doing an "Exact Spelling" Surname match, then offer the second suggestion (to use D-M Soundex)... Or even show them a list of the surnames that would be matched by D-M Soundex and Partial Text searches, showing the number of hits for each, hyperlinked. (See Item #66, re showing additional potential matches).
- If there are no matches for the town name, then offer the third suggestion (remind the user about modern town names). Also see Item #8, on town name synonyms.

If more than one case above applies, each of the relevant hint messages should be provided. In all cases, a hyperlink to section 3.6 of the JGFF-FAQ should be given.

10/2/2002

-------------------------------------------------------------------

63 - Separate search types for Surname and Town Name.

Currently, the Search Type (Exact Spelling Match, D-M Soundex, etc.) applies to BOTH the Surname and Town Name parameters. A nice enhancement would be to allow each to be specified separately -- it would provide for more refined searches.

We could place a "Search Type" pulldown underneath each of the "Surname" and "Town" input boxes on the search form. While this might be a little more confusing for some users, it has the potential to be very useful.
For example, someone might want to do a soundex search for a surname within a specific locality -- actually, that should probably be the default setting: "Exact Match" for the town name, and "D-M Soundex" for the Surname.

Having the surnames and towns in separate tables (see Item #1b) would probably make the implementation of this much easier. It also might resolve some of the odd search bugs (see Item #2a).

10/4/2002

-------------------------------------------------------------------

64 - "Show all my matching surname/town entries" function.

A web-based function to get all JGFF Town/Surname entries that are potential matches of any of a user's own JGFF entries. Based upon the JGFFAlert's match algorithm (see Item #21): Would find all entries that are EXACT matches for the Town/Country name, and a SOUNDEX match of the Surname.

This saves the user the labor of searching for all of their surnames/towns individually. Of course, this function will miss possible entries of interest -- i.e. the same surname in a nearby town, a surname that's not a soundex match, etc. Users should still do individual surname/town searches, for thorough research. That can never be automated.

Should include an "only show items updated since [date]" filter, as the standard JGFF Search form does.

This function would be a "value-added" feature -- available to us internally, and publicly only to those donors who give over a certain level.

*** Michael completed implementation 10/27/2002: < http://www.jewishgen.org/jgff/jgffmatches.htm > ... but there are performance issues. It works only if you have the correct code/password AND the researcher is a valid JGFFAlert subscriber; or you can use the master JGFF password. It is slow... it can take over 100 seconds to run, slowing everything else on the system down, and your browser may time out... thus we will likely have to wait for a hardware upgrade before releasing this to the public.
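For reference, the matching rule described in this item (exact Town/Country, soundex Surname, optional "updated since" filter) can be sketched in Python as below. This is not Michael's implementation: the field names are invented for illustration, and the Daitch-Mokotoff coding itself is not implemented here; the `soundex` argument is a placeholder for whatever coding function is actually used.

```python
from datetime import date


def find_matches(my_entries, all_entries, soundex, since=None):
    """Return other researchers' entries that match any of my entries.

    Entries are dicts with hypothetical keys: code, surname, town,
    country, updated. `soundex` is any function mapping a surname to
    its soundex code (the real D-M Soundex is assumed, not shown).
    """
    def key(entry):
        return (soundex(entry["surname"]),
                entry["town"].lower(),
                entry["country"].lower())

    wanted = {key(e) for e in my_entries}      # built once, looked up often
    my_codes = {e["code"] for e in my_entries}

    matches = []
    for entry in all_entries:
        if entry["code"] in my_codes:
            continue                           # skip the user's own rows
        if since is not None and entry["updated"] < since:
            continue                           # "only show items updated since"
        if key(entry) in wanted:
            matches.append(entry)
    return matches
```

Building the set of wanted keys once and making a single pass over the data avoids running one full search per surname/town entry, which may or may not be relevant to the performance problem noted above.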
10/27/2002

-------------------------------------------------------------------

65 - "Blind Contact" page - fix footer.

On the bottom of the "Blind Contact" form page at http://www.jewishgen.org/wconnect/wc.isa?jg~jgsys~jgffcont~CONT~xxxxx there are TWO duplicate copyright lines -- both outdated. This should be replaced with the JGFF's standard SSI footer, file "/htdocs/jgff/footer.txt".

Also, change the text on the button from "Send my request" to "Send my message".

Also, see comments in the "General To Do List", Item #15, re "FTJP Stuff: "Contact Submitter" form". < http://www.jewishgen.org/projects/desc/GeneralToDo.txt >.

When we implement Central User Registration Environment (CURE) < http://www.jewishgen.org/projects/desc/CURE.htm >, and require registration to SEARCH the JGFF, then the user's name, email address and other contact information should be automatically filled in on this form, and *not* be editable.

11/11/2002, 11/5/2003.

-------------------------------------------------------------------

66 - Show list/count of matches and additional potential matches.

In the search results display page, show statistics about the matched entries. Also perhaps show statistics about additional *potential* matches, if other search types were to be used.

This is an idea that I got from the new 5.0 Inktomi Enterprise Search Engine. When Inktomi displays the results of a search, it also displays some statistics about the results at the top of the page. See samples by trying out the Inktomi engine at < http://www.jewishgen.org/JewishGen/Search.htm >. For instance, if you search for "immigration and Liverpool", it will tell you: "immigration (135), Liverpool (26), immigration and Liverpool (8)", showing the number of hits for each search term.
If you do a wildcard search in Inktomi, such as "b?rns*t*n", Inktomi will display the list of matching terms and the number of hits for each, e.g.: "barnstein (6) bernshtein (62) bernshteyn (11) bernstayn (4) bernstein (464) bernsteyn (3) birnstein (9) bornsteiin (2) bornstein (295) bornstien (2) bornsztain (8) barnshtein (1) barnston (1) barnsztajn (1) barnszteyn (2) bernshtayn (16) bernshtejn (1) bernshtin (1) bernstain (1) bernstejn (1) bernsten (1) bernstien (2) bernstyn (1) bernsztain (2) bernsztajn (19) bernsztein (16) bernsztejn (30) bernszteyn (1) bernsztyn (3) birnsztajn (2) birnsztein (1) birnsztejn (3) bornshtein (9) bornshteyn (3) bornstain (4) bornstajn (5) bornstayn (1) bornstejn (3) bornsten (2) bornszcztajn (1)".

We could do the same for Soundex, Wildcard, and Partial Text searches in the JGFF: Show a summary list of matched entries and hit count for each, in descending frequency order, before showing the actual hits. This feature will also allow us to more easily spot problem entries and other unusual entries.

A further enhancement, if allowed by performance considerations, would be to display to the user a list of the entries that *would be matched* by alternative search methods. For example, if a user does an Exact Spelling Match search for a name, also show a statistics report of the entries that could be found for the same input using D-M Soundex and Partial Text searches, showing the number of hits for each. These would be hyperlinked, and clicking on the hyperlink would execute that alternate search. Perhaps this report could be at the bottom of the search results, as "suggestions".

Ed Rosenbaum's "Belarus Surname Index" and "Galicia Surname Index" at < http://www.jewishgen.org/belarus/static_index.htm > and < http://www.jewishgen.org/Galicia/surdex/static_index.htm > contain this feature: For each surname, a hyperlinked list of the other surnames with the same Daitch-Mokotoff soundex code.
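The summary list proposed in this item (each matched spelling with its hit count, in descending frequency order) is cheap to compute once the matched rows are in hand. A minimal Python sketch follows; the function name and the lowercase folding are assumptions made for illustration.

```python
from collections import Counter


def summarize_hits(matched_surnames):
    """Format matched surnames as 'name (count)' pairs, most frequent first."""
    # Fold case so "Bernstein" and "BERNSTEIN" are counted together
    # (an assumption; the JGFF may already store names in one case).
    counts = Counter(name.lower() for name in matched_surnames)
    # Sort by descending count, then alphabetically to break ties.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return " ".join(f"{name} ({n})" for name, n in ordered)
```

This only summarizes the results of the search that was actually run. Counting what the alternative search methods *would* match requires running those searches too, which is where the cost lies.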
Calculating the statistics for the alternate search methods might be an expensive operation to add to every search, so this function might have to wait for hardware or software upgrades.

11/14/2002

-------------------------------------------------------------------

67 - Review and correct ShtetLinks documentation, write macro.

The "How to Create a ShtetLinks Page" documentation at http://www.shtetlinks.jewishgen.org/documentation/htminstr.htm#jgff contains sample HTML code for providing a button to search the JGFF for a town. The sample code should be reviewed for accuracy, and updated and simplified if necessary.

Perhaps we could also write a simple CGI script to encapsulate this code, to make it easier for users to implement a call to the JGFF, instead of having them call wconnect/wc.isa?jg~jgsys~jgff directly. We did a similar encapsulation for MapQuest.

The use of an intermediate script also opens the possibility for us to add additional security mechanisms (See Item #28), as well as more easily change the interface in the future, without impacting the calling users.

11/22/2002

-------------------------------------------------------------------

68 - Incorrect password entered - better feedback.

If a user enters an invalid JGFF password, and selects the second or third option, the error message appears BELOW all of the town entry instructions. The instructions should not appear at all in this instance.

11/29/2002

-------------------------------------------------------------------

69 - Standard Headers, Footers, and StyleSheet.

Use standard external SSI headers and footers for ALL pages -- both static AND dynamic. Use and at the top and bottom of each file, respectively. Same for the CSS (Cascading Style Sheet) to be added to the section: This will make the application of global stylistic changes easier.

1/13/2003.

-------------------------------------------------------------------

70 - Another Display Option for user contact.
The current user contact display options are:
1. Display my researcher code only (provides maximum privacy through a protected e-mail contact system)
2. Display my name and e-mail address (insures privacy to a degree)
3. Display my name, email address, and complete postal address (least degree of privacy)

It has been suggested that we add a fourth level, a new #2 between current #1 and #2, which would display the user's Name and Researcher Code #, but NOT list their email address. Contact would be made via the Blind Contact System. This would encourage users to list their names, but still protect their email addresses. It would allow a researcher to know the name of the person to whom they are writing... and help them remember if they've contacted this person before.

It has also been suggested that we eliminate or reword the parenthetical descriptions after each option (e.g. "provides maximum privacy..."), which might be encouraging newly registering researchers to elect the anonymous option (option #1, display researcher code only), which has its share of inconvenience and problems.

3/1/2003.

-------------------------------------------------------------------

71 - Administrative: Display all "new" entries for a country.

SIG leaders have requested that they be able to search for "All researchers who have registered an interest in Belarus in the last two months" for example. Perhaps this could be better accomplished via GoldMine?

4/10/2003.

-------------------------------------------------------------------

72 - Integration with DNA Project.

"Genealogy by Genetics" is DNA testing offered by one of JewishGen's partners, Family Tree DNA (FTDNA). On the FTDNA web page, < http://www.jewishgen.org/dna >, it states:

We will be integrating the FTDNA database library with existing JewishGen databases, providing untold numbers with the ability to connect with lost branches of their families.
Bennet Greenspan of FTDNA says that he is ready to provide us with the 2,987 surnames in their database. One of the things we were going to do was identify which surnames in our JGFF have already had DNA tests done and appear in FTDNA's database.

The JGFF is the most logical place to integrate this data, since it's our only surname-based database covering all regions. But there's not really an easy efficient way to integrate this FTDNA data, without slowing down the entire JGFF. I wouldn't recommend adding another column to the JGFF DATA table for this. I assume that the FTDNA data contains only surname, not town names.

We could have a separate table of these 3K surnames, and search it whenever the JGFF is searched, using the matched JGFF results, and put an icon next to those surnames which are in the FTDNA database. Clicking on that icon will yield information about the FTDNA project.

What should we do if a surname is in the FTDNA database but not in the JGFF database? Maybe just have two different sections in the search results page, and we DON'T integrate the two. That seems cleaner, actually.

However, we need to be careful and not clutter up the JGFF with all sorts of distractions. We should not use the JGFF to promote FTDNA -- the focus needs to remain on the simple basics of the JGFF. People are distracted/confused enough already.

5/31/2003.

-------------------------------------------------------------------

73 - Add "Time of Last Contact".

User suggestion, 30-Jun-2009:

When researching a town, there is a column in the database named "Last Updated" that corresponds to the last time the line item (family name and researcher) was updated. In many cases, people have entered the surname along with contact information many years ago. However, it would be nice to know if the researcher who entered the information is still active, as there have been many times that I have tried to contact the researcher only to find out that the contact info is obsolete.
Could you insert a field that might provide information as to the last time the researcher logged into the system (month/year would be fine)? This way we would know that the researcher was active. An alternative would be to determine the last time the researcher logged in (let's say within the last 6-12 months) and just show "Active." This way we reduce any privacy concerns.

-------------------------------------------------------------------
-------------------------------------------------------------------
-------------------------------------------------------------------

To Do List - Prioritization:
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Here is my quick triage list of priorities of the above items, as of 10/28/2002. Some placements are a little arbitrary...

HIGH:
#1 - Town Name cleanup / WOWW2 sync
     #1a, #1b, #1c all tied together... and
     Closely related items: #7, #8, #35, #37, #47, #50, #56, #57, #58, #63.
#3 - US States -- Enforce two-letter state/province code.
#26 - Better duplicate researcher detection.
#27 - Town name validation for places outside Eastern Europe.
#27a - Town name cleanup for countries outside Eastern Europe.
#29 - Integration into the "All Country" Databases.
#39 - Better error feedback for Address Entry/Edit screen.
#40 - Re-indexing automation.
#42 - Automated Password Retrieval
#44 - Renewal Notices.
#45 - Researchers with no data - better "Modify" interface.

MEDIUM:
#2, #2b - Wildcard character restrictions, documentation.
#18 - "Email Details are Suspect" message removal.
#22 - Contact all researchers for a town.
#28 - Security.
#31 - Documentation for JGFF FAQ.
#51 - Integration with the "JewishGen Family Links Database".
#54 - User "NOTE!" and Auto-alert settings fields.
#55 - Country list consistency.
#61 - List JGFFAlert status info on Researcher Info page.
#66 - Show list/count of matches and additional potential matches.

LOW:
#4 - Accents
#5 - Display Results in Batches
#5a - Upper limit on total number of matches.
#6 - Display all of a fellow researcher's entries.
#13 - Internal: Links on "Search JGFF Researcher Names" page.
#14 - Printed version the JGFF.
#16 - Multiple Button press suppression.
#21a - Enhancements to JGFFAlerts facility.
#23 - Internationalization.
#30 - "Another Search" link incorrect.
#32 - Create a way to temporarily "hide" a researcher.
#33 - Provide an "heir" for deceased researchers.
#34 - A tutorial.
#36 - Remove visible passwords from URLs.
#38 - Town sponsorship banners.
#41 - Testimonials.
#46 - Sort order -- Display results in a user-selected order.
#53 - User deletion.
#59 - Enhance the statistics report.
#60 - Impose a limit on the number of surname/town entries per user.
#62 - When "No matches found", offer suggestions.
#65 - "Blind Contact" page - fix footer.
#67 - Review and correct ShtetLinks documentation, write macro.
#72 - Integration with DNA project.

DIFFICULT / EXPENSIVE / LONG-TERM:
#9 - Proximity Searches
#12 - Disallow/restrict spaces in the Surname field.
#19 - Upgrade to the latest versions VFP / WebConnect.
#20 - Ideas about using virtual disks (RAM disks).
#24 - Email address validation.
#49 - Prevent user entries with no Surname/Town data.

STILL NEEDED?:
#25 - Merge users -- Confirmation page, and Dupe deletion.
#43 - Administrative notification of researcher address changes.

COMPLETED:
#2a - Bugs in Wildcard Searches
#10 - Display the Researcher Code number more prominently
#11 - Break the Researcher Address info into more specific fields.
#15 - "Any Country" town entries -- clean up.
#17 - JGFFHelp Form.
#21 - JGFF Alerts facility.
#64 - "Show all my matching surname/town entries" function.

-------------------------------------------------------------------

Warren Blatt