Biocrawler talk:Verbatim copying
From Biocrawler, the free encyclopedia.
Biocrawler and Biocrawler users do not give legal advice
Does this mean if one publishes more than 100 copies they must also maintain a mirror web site of the entire Biocrawler site if the URL at Biocrawler is to be considered unstable? Is that really reasonable? Why would the article disappear and if it was moved wouldn't the previous name be left as a redirect? Of course if Biocrawler disappeared that could create a problem, but is that a possibility? Just a thought. — Alex756 21:51, 30 Nov 2003 (UTC)
- Thanks for reading. :)
- An article would disappear because it was deleted, which happens quite often - check the wikipedia:deletion log for an idea as to frequency - it's perhaps 100 a day, roughly? I lost count. When articles are moved, typically the redirects are left behind, but sometimes the redirects are deleted, or turned into disambiguation pages, etc. Still, one only has to be "reasonably prudent", so maybe these things aren't a problem?
- If you don't want to rely on Biocrawler, you don't have to maintain a complete mirror of it. The simplest option for verbatim copiers would be to download the database and rehost that. I'll add some details to the article. Martin 19:27, 1 Dec 2003 (UTC)
Removed from article:
- === History Subunit ===
- As the GFDL was never intended for wiki articles, things get complicated. Some allege that the "page history" link is the history subunit and that you should include this "by reference" by linking to it.
- However, the "page history" is not in the correct format for the GFDL, is missing information that would be required under the GFDL if it were the history subunit, and is not "Entitled" History. Further, there has been no official declaration from either Bomis or the Wikimedia foundation that the "page history" is intended to represent the history subunit. Further, it is possible to download the Biocrawler database without downloading the full history.
- As a result, some allege that you may ignore the "page history" for the purposes of verbatim copying. If the text includes a section Entitled "History", then you should of course copy that along with the rest of the body text.
- I'm not sure what this is trying to get at. If you copy an article under the GFDL, verbatim or otherwise, you have to preserve the "history" section (whatever it might be called). On Biocrawler it's called "Page history"; it was given this title precisely because that's the name mentioned in the GFDL. If you copy the document, you have to preserve the history.
- All that's pretty clear from the GFDL. The above paragraphs don't seem to be interpreting the GFDL in a meaningful way. I don't think they add anything useful; I'd be interested to hear other people's views. Enchanter 00:14, May 16, 2004 (UTC).
- In fact, "Page history" is not the title mentioned in the GFDL. The GFDL says "History". Please re-read the text of the GFDL. One of the relevant sections is 4I:
- I. Preserve the section Entitled "History"...
- Thus, it is disputable whether Biocrawler's "page history" is supposed to be the GFDL "History" subunit, as the Title is wrong, and for the other reasons given in the paragraphs above. The paragraphs explain the dispute (though apparently not clearly enough), and point out the two differing interpretations that people have taken on the subject. Martin 22:32, 16 May 2004 (UTC)
- As far as I am aware, the page history has always been intended to mean the history section of the GFDL. When the original Biocrawler software was written, the link was called "history" for precisely this reason (the link was subsequently changed to "page history", presumably for clarity). There was quite a bit of fuss when the software was introduced because the page histories weren't always preserved, which meant that we weren't fully GFDL compliant.
- Furthermore, if the page history section isn't considered as part of the document, that means that wikipedia is not GFDL compliant. The history section is not optional.
- I'm concerned that this page is giving too much credence to what I see as a marginal and tenuous argument. Enchanter 18:28, May 17, 2004 (UTC)
- Well, pages such as the wikipedia:GFDL History that Cunctator created suggest that, right from the start, this issue was unclear. As far as I know, The Cunctator never got a definitive response on this. I suspect that this is deliberate - Jimbo doesn't want to leave any hostages to fortune, especially with the possibility of renegotiating the GFDL in the future.
- That page shows that one individual thought the issue was unclear two years ago; it doesn't reflect the outcome of any debate or decision. As far as I'm aware, Cunctator never got a definitive response on this because he didn't even press the question - he let the matter drop. Enchanter
- So do we agree then, that neither of us know whether or not the "page history" was intended to mean the history section of the GFDL, or whether it was intended simply as an editorial tool? Martin 00:35, 26 May 2004 (UTC)
- I agree that the original intention is unclear; the link was originally put in by Magnus Manske when he wrote the software, and I doubt he thought very hard about these issues. At the time, there were lots of ways in which our GFDL compliance was not very good; for example, we didn't have a proper copyright notice. There is some ambiguity, and I think it would be nice to resolve it in some way. Enchanter 22:56, May 28, 2004 (UTC)
- The history section is optional, in this sense - you can write an essay, have no history section, and place that essay under the GFDL. It is only if you create modified versions, and only have permission to do so from the GFDL, that you require a history section. For example, if I'm the sole author of a document I've released under the GFDL, I can modify it, release the modified version, and not have to put in a history. As Biocrawler is a collaborative work, and most articles are created solely on Biocrawler, they may fall into this category. Or they may not.
- Copyright on Biocrawler articles is generally retained by the contributor. That means that when you modify an article, you do not hold the copyright to the version you are modifying, so the only way you can create a derived work is by following the provisions of the GFDL for creating a derived work. That applies to more or less every edit on Biocrawler where an article has been modified by more than one person. Enchanter
- Yes, copyright is maintained by the contributor. When you modify an article, you do not hold the copyright to the version you are modifying, so the only way you can do so is if the copyright holder has granted you a license. Fortunately, such an implicit license exists - the edit box warning that Biocrawler users will mercilessly edit the submitted content, and the result (as with all contributions) will be released under the GFDL, backed by the implicit license arising from contributing to a collaborative encyclopedia. As Biocrawler is a collaborative work, this interpretation is quite natural, maybe even "common sense". Martin 00:35, 26 May 2004 (UTC)
- I think arguing for such an implicit license is a bit of a stretch. For a start, when the warning that Biocrawler contributions would be 'mercilessly edited' was written, it wasn't with this interpretation in mind. There was no discussion at the time along the lines of the one that we are having. So I think what you are arguing for is a creative reinterpretation of something that someone wrote, which is not in line with what the writer meant.
- Another issue is that the page history stores the history of the authors of the document. Whether or not we give an implicit licence for others to edit the work without invoking the GFDL, I'm not sure that we give an implicit licence for the authorship information not to be preserved. It could be argued that it is important for both the letter and spirit of the GFDL that this information about who wrote what is kept. Enchanter 22:56, May 28, 2004 (UTC)
- If a GFDL History is required, then Biocrawler is not GFDL compliant, as the "page history" does not meet the requirements set out in the GFDL, section 4I. This is the case regardless of whether or not one considers the "page history" to be included by reference in the article.
- Just what is the problem with the requirements set out in section 4I? The GFDL asks that the "title, year, authors, and publisher" are given. Year and authors are given; the title is the same for each revision, so it's obvious; similarly it's obvious that the publisher is Biocrawler, unless explicitly given otherwise in the edit summary text. So I don't see anything seriously wrong with the format. Enchanter
- See below. Martin 00:35, 26 May 2004 (UTC)
- Personally, I have grown resigned to the fact that all interpretations of the GFDL arising from actually reading it are tenuous, result in Biocrawler being non GFDL-compliant, are hideously impractical, or some combination of the three. However, I do understand your concern about emphasis, so I will edit that section of the page. Martin 18:39, 25 May 2004 (UTC)
- Agreed, interpreting the GFDL is not for the faint-hearted! Enchanter
To clarify my concerns about the page history, this is my response to the reasons given in the article, that the Biocrawler page history:
- ... is not in the correct format
- The GFDL asks that the "title, year, authors, and publisher" are given. Year and authors are given; the title is the same for each revision, so it's obvious; similarly it's obvious that the publisher is Biocrawler, unless explicitly given otherwise in the edit summary text. So I don't see anything seriously wrong with the format. Enchanter
- The title is not the same for each revision, as articles can change title. I don't believe the GFDL provides an exception for things that are "obvious". It says "for each item, state the publisher". The history doesn't do that. Also, it doesn't distinguish between "authors" and "new authors". Martin 00:35, 26 May 2004 (UTC)
- Again, I agree that Biocrawler is not fully compliant; I don't agree that this warrants the conclusion that the page history does not represent the GFDL history. Enchanter 22:56, May 28, 2004 (UTC)
- ... is missing required information
- You can give whatever information you like in the edit summary; what is "missing"? Enchanter
- My bad - I confused myself when I added this. Removed. Martin 00:35, 26 May 2004 (UTC)
- ... is not "Entitled" History (though it was in the past)
- Saying that the history section isn't the history section because the title says "Page history" or "Revision history" rather than "history" seems a bit silly to me, and not a strong argument either in legal terms or according to common sense. Enchanter
- I'm happy for third parties to judge the strength or weakness of the argument, and act accordingly. Martin 00:35, 26 May 2004 (UTC)
- I don't think that that is adequate. If you're using something under license, and there's ambiguity about how the license is to be interpreted, the natural response is to ask the owner of the materials you are using (the Biocrawler contributors) what they think - after all, they are the people who could sue you! I think Biocrawler ought to have a view on this, and that this view should be reflected in the copyright pages. Enchanter 22:56, May 28, 2004 (UTC)
- ... is incomplete in many articles, notably those that have been merged or split, in very old pages, in pages moved or translated between wikimedia projects, or in pages sourced in part from external GFDL sources.
- Agreed, the history isn't merged as the GFDL says. But in general, the edit comments should give enough information to give a proper history (admittedly, assuming people fill in the box fully!) Enchanter
- Yes, a proper history could in some cases be reconstructed in theory. That doesn't invalidate the point. Martin 00:35, 26 May 2004 (UTC)
- I agree with the point that the page history is not fully compliant with the GFDL history. I don't agree that because it is not fully compliant, you could reasonably conclude that it's not actually the history at all, as defined by the GFDL. Enchanter 22:56, May 28, 2004 (UTC)
- It is possible to download the Biocrawler database without downloading the full history.
- So what? You can download any combination of pages you like; it doesn't change how it is licensed. Enchanter
- It would be odd to make a download available which, were anyone to take advantage of that download, they would be in violation of the GFDL, as it was not a verbatim copy. Martin 00:35, 26 May 2004 (UTC)
- The download can be used to produce another site, which effectively includes the history by linking back to Biocrawler - which is what we have always suggested that people reusing Biocrawler data do. Enchanter 22:56, May 28, 2004 (UTC)
There are various interpretations. The interpretation you favour is more conservative, but also less practical. My favoured alternative interpretation is more questionable from a legal POV, but also more practical. A more conservative interpretation still is that Biocrawler is systematically non-compliant, and no legal copying is possible. Third parties have to choose which interpretation they favour, weighing legal risk against other factors. Martin 00:35, 26 May 2004 (UTC)
- I don't think this is just a matter to leave to third parties - it has relevance for all Biocrawler contributors. For example, my interpretation requires that the information about who authored what is preserved, whereas yours implies that contributors give up their right to be credited with their contributions under the GFDL.
- Also, we want other people to be able to reuse our material with a minimum of fuss - leaving this unresolved makes understanding how to reuse our material more difficult. Enchanter 22:56, May 28, 2004 (UTC)
All I want is for this page to tell the truth, no matter how inconvenient that may be. The truth is that currently there are some uncertainties regarding what is required for a verbatim copy of a Biocrawler article to be compliant. If you manage to change the truth, I'll be happy for the article to change to reflect that. Martin 00:45, 29 May 2004 (UTC)
To make one thing clear - I know that both interpretations have problems. Both of their problems could be patched up in various ways, both technically and legally. My problem with your favoured interpretation is that including the entire history verbatim, even in printed copies, is incredibly impractical, and I suspect not what most contributors actually want. It seems to me that the desire is rather one of (a) author credit and (b) requiring a link back. Fortunately, both of these things can be done in my interpretation, and I hope this will be done in a future version of mediawiki. Martin 23:01, 29 May 2004 (UTC)
Changes and removals
edit summary: "some changes and removals to clarify what the GFDL says"... yeah, I'd like more explanation than that, please.
Some content regarding the handling of links - removed the two legally questionable exceptions. Why? Section two just says "verbatim copying". Verbatim means "word for word", so it should be entirely possible to change formatting details like links while still making a verbatim copy. This is what happens when I take a printout of a Biocrawler page, for example.
"Biocrawler's current License and Copyright Statement is itself questionable" - just removed. Well, it is questionable, certainly. Would you like me to explain why?
"Embedding a Biocrawler Document within a larger webpage" - removed "legally questionable". Well, it is questionable - the question is whether such embeds are considered "aggregation" or "derivative works". That question would have to be resolved in court - there have been some cases around this subject, IIRC, but nothing conclusive.
"Relying on Biocrawler is legally questionable because pages on Biocrawler may be deleted from public view" - removed. Well, why? This is just true - some pages on Biocrawler are deleted, and to the extent that this happens, Biocrawler will not be a machine-readable copy of the distributed text, and thus the sublicensee will not be in compliance with the text of the GFDL. So, it's questionable. People can do this, but they need to be aware that there may be issues with doing so. Martin 22:47, 16 May 2004 (UTC)
Images
As we have a formal statement from Jimbo that the aggregation interpretation is go, I've used this to draft a couple of sections.
We also want buy in from our contributors, so in due course such a statement should go into wikipedia:Submission Standards. This will allow redistributors to rely a little more confidently on that interpretation. Martin 13:01, 19 Jun 2004 (UTC)
Are we serious?
I have a small static site called [fixed reference (http://fixedreference.org)] which I am trying to keep compliant with the various developments in interpretation of the licence, but I have to say that the history is the most problematic. In principle I don't see why this discussion assumes the edit history contains an adequate history of the evolution of the article: a lot of the text is built by teams on the talk pages as well: do we need local copies of all versions of all those to be compliant? I for one do not want future users of my work to have to carry hundreds of pages of discussion and versions around before they are allowed to use it, and trying to include all this in, say, a CD version for schools is ridiculous: so what do we do? What if there is indecent material in the discussion pages: are we legally obliged to include it in a version for five year olds? Britannica must be laughing all the way to the bank. I would take an aggressive line: state that the content comes from WikiPedia and that WP will provide the next step of information up the chain. This is the only workable route for this project to have any value to anyone. 99% of it comes under the additional agreement submitters have accepted, "If you do not want your writing to be edited mercilessly and redistributed at will, do not submit it.", in any case. --BozMo|talk 09:08, 12 Aug 2004 (UTC)
Title page
The section Title page does not really say if the title page (From Biocrawler, the free encyclopedia) should exist in the verbatim copy. I'd prefer not, because the text is quite naïve and changes over time. (Actually, I am not making any verbatim copies, but the question is interesting.) --Etu 02:02, 5 Dec 2004 (UTC)
I think that you do not have to include the title page. Brianjd 09:57, 2004 Dec 27 (UTC)
- I think a verbatim copy is a copy of the whole document, and anything that is part of the licensed document should be kept the same.
- Besides, this page currently says that the Title Page is below the title, but it is near the title. See the first section of the license, where terms including Title Page are defined. Tomos 04:03, 28 Dec 2004 (UTC)
License and Copyright Statement
License and Copyright Statement - currently, "All text is available under the terms of the GNU Free Documentation License (see Copyrights for details). Disclaimers.".
Note the link to GNU FDL, which redirects to GNU Free Documentation License. However, the actual license and copyright statement links to Biocrawler:Text of the GNU Free Documentation License.
Webpage copies
Title
From Biocrawler:Verbatim copying:
- You may not change the Title. For example, in this document, the Title is Biocrawler:Verbatim copying.
It's my understanding of the GNU Free Documentation License (GFDL) that we must change the title unless the original publisher of "a" (does that mean "any"?) previous version gives permission to use the same title as that version, in which case it is up to us.
In summary: The Wikimedia Foundation cannot make that restriction. Brianjd 06:38, 2004 Dec 22 (UTC)
- In the case of verbatim copying, I think everything has to be kept, including the title. In the case of modification, the title has to be changed, because of provision 4-A. So I take this part as simply rephrasing what verbatim copy means in the context of the GFDL. Tomos 03:59, 28 Dec 2004 (UTC)
- You seem to be saying that copying is verbatim if and only if the entire Document (including the Title) is the same. Then if we change the title, we have created a Modified Version, and everything will be fine if and only if we follow section 4. Brianjd 10:10, 2004 Dec 29 (UTC)
Title Page
- The "Title Page" is the text just below the title, before the start of the article proper. This is currently "From Biocrawler, the free encyclopedia", but was previously (briefly) "Find out how you can help support Biocrawler's phenomenal growth", etc.
If I remember correctly, then on the MonoBook skin, for a short while, there was no "Title Page".
- Where Biocrawler has imported text from a third party, such as Nupedia, the Title Page may additionally extend to some italicised block text immediately after the title, and before the start of the main text.
That seems a bit vague. I would expect at least a link to an example of such an article ("Document" according to the GFDL). Brianjd 06:38, 2004 Dec 22 (UTC)
- I looked at some articles linking to Nupedia, and italicized texts were after the main text. I suppose that is technically not very good, though it is hard to miss that piece of text anyway. [1] (http://en.wikipedia.org/w/index.php?title=Special:Whatlinkshere&target=Nupedia) Tomos 10:57, 29 Dec 2004 (UTC)
History Section
As the GFDL was never intended for wiki articles, things get complicated.
I don't think the GFDL was ever intended for use by non-lawyers. The GFDL already seems a lot more complicated than it needs to be. Brianjd 06:38, 2004 Dec 22 (UTC)
Some interpret the GFDL as applied to Biocrawler such that the "history" link (known as the "page history" in non-standard skins) is the GFDL History Section.
"Non-standard skins"? Can't we just say "some skins"? And I don't think that the page pointed to by the "history" link - not the "history" link itself of course - is the GFDL History section, because the GFDL History section needs to be a "section" of the "Document" in the usual sense, and the Biocrawler history is not. Brianjd 06:38, 2004 Dec 22 (UTC)
There are some issues with this interpretation, making it legally questionable: the "page history"...
- (3 other points)
- It is possible to download the Biocrawler database without downloading the full history.
I would rather see a more consistent wording:
- ...is not always included when the Biocrawler database is downloaded. Brianjd 06:38, 2004 Dec 22 (UTC)
License and Copyright Statement
Here you have more freedom, chiefly because Biocrawler's current License and Copyright Statement is itself questionable.
Well I hope they're doing something about it then! Brianjd 06:38, 2004 Dec 22 (UTC)
You must link to a local copy of the GFDL.
See Biocrawler talk:Text of the GNU Free Documentation License#Why is it on Biocrawler? for some more discussion regarding this point. I note there that the Wikimedia Foundation's website ([2] (http://wikimediafoundation.org/)) doesn't seem to have a local copy of the GFDL - the links at the bottom of pages point to http://www.gnu.org/copyleft/fdl.html - and here that Biocrawler is the only wiki I know that has a local copy of the GFDL. Brianjd 06:38, 2004 Dec 22 (UTC)
- It sounds like then, Biocrawler is following GFDL better than other wikis and Wikimedia Foundation. That's good. Section 6 of the GFDL says that in a collection of GFDL documents, GFDL licenses in individual works could be replaced by just one copy. Tomos 04:11, 28 Dec 2004 (UTC)
- It sounds like then, Biocrawler is following GFDL better than other wikis and Wikimedia Foundation.
Agreed, but they are still not doing enough.
- Section 6 of the GFDL says that in a collection of GFDL documents, GFDL licenses in individual works could be replaced by just one copy.
How is this relevant? Brianjd 10:03, 2004 Dec 29 (UTC)
- It sounds like then, Biocrawler is following GFDL better than other wikis and Wikimedia Foundation.
- As far as having a copy of full license-text, Biocrawler is doing perhaps all it needs to do. If we regard Biocrawler as a collection of independent documents, that is. We have a full text in our collection, and it is linked from the bottom of each content page. Or you have a different take? Tomos 10:49, 29 Dec 2004 (UTC)
Printed copies
If you accept this interpretation, then you would need to print out the entire "page history" of the article (you may need to URL-hack to get the entire page history on a single webpage).
This is unacceptable. I don't think this needs much explanation. Brianjd 06:38, 2004 Dec 22 (UTC)
- I'm confused. Say I want to collect a number of entries and compile them into an academic study book for my church group. This book would be made available for purchase on a printing service like LuLu. Would I be required to print the complete text of the "history" tab for each entry? As I understand the discussion above, that would make printed reproduction of any wikipedia entry nearly impossible. We can't have 1,000 line entries for each page of definition. Obviously I don't want to violate any user obligations.
In any event, your ability to redistribute Biocrawler articles beyond the terms of the GFDL are extended (at least in the United States) by the fair use and first-sale doctrines.
I don't know much about those but I presume that they are subject to change. Why are we singling out the United States here? Brianjd 06:38, 2004 Dec 22 (UTC)
- I think it is okay to add non-US examples. But it seems uncertain to me what the relevant law would be when a user outside of the US uses Biocrawler contents and the Wikimedia Foundation finds it a breach of the license. Always U.S. law? Depends on where the breach happens? I don't know. But it seems there is a fair chance that U.S. law is the strongest candidate for the governing law in case of some dispute. And that is perhaps why. I wish I knew better. Tomos 10:54, 29 Dec 2004 (UTC)
Copying in quantity
If you are copying more than 100 copies, then section 3 (copying in quantity) comes into play. Chiefly this means that you must include either:
- (2 other points)
- (legally questionable) A URL for Biocrawler - note that "(legally questionable)" was originally in italics
Relying on Biocrawler is legally questionable because pages on Biocrawler may be deleted from public view, or Biocrawler as a whole may be taken down for legal or financial reasons. However, it is clearly the easiest solution, and Biocrawler is pretty stable.
Is this really specific to Biocrawler? This doesn't seem to be consistent with the GFDL, in spirit at least. Brianjd 06:38, 2004 Dec 22 (UTC)

