/*
htmLawed_TESTCASE.txt, 29 June 2008
htmLawed 1.1, 29 June 2008
Copyright Santosh Patnaik
GPL v3 license
A PHP Labware internal utility - http://www.bioinformatics.org/phplabware/internal_utilities/htmLawed
*/

This file has UTF-8-encoded text with both correct and incorrect/malformed HTML/XHTML code to test htmLawed (test cases/samples). The entire text may also be used as a unit.

************************************************
when viewing this file in a web browser, set the
character encoding to Unicode/UTF-8
************************************************

--------------------- start --------------------

Try different $config and $spec values. Some text even when filtered in will not be displayed in a rendered web-page
Attributes
Xml:lang:, ,
Standard, predefined value, or empty attribute: , ,
Required: , image
Quote & space variation: a, a, a
Invalid: a
Duplicated: a
Deprecated: a,

Casing:
Admin-restricted?:
Attribute values
Duplicate ID value:, ,
(try 'my_' for prefix)
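(a rough sketch, in Python rather than PHP, of what a prefix such as 'my_' could do to repeated ID values; the function name 'dedupe_ids' is hypothetical, and the rule of prepending the prefix until the value is unused is an assumption about the behavior being tested, not htmLawed's code)

```python
def dedupe_ids(ids, prefix="my_"):
    """Return the ID values with repeats made unique by prepending a prefix."""
    seen = set()
    result = []
    for value in ids:
        # Assumed rule: keep prepending the prefix until the value is unused.
        while value in seen:
            value = prefix + value
        seen.add(value)
        result.append(value)
    return result


if __name__ == "__main__":
    print(dedupe_ids(["a", "a", "b", "a"]))
```

(comparing the filtered output of the lines above with and without a prefix shows whether duplicates are removed or renamed)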
Double-quotes in value:, ,
(try filter for CSS expression)
CSS expression:

Other: ,
(try 'maxlen', 'maxval', etc., for 'input' in '$spec')
Blockquotes
abc

abc
def

abc
def

abc
def
ghi

abc
def
ghi
(try with blockquote parent)
CDATA sections
Special characters inside: ]]>, 3.5, & 4 > 4 ]]>
Normal: , CDATA follows:
Malformed: , < ![CDATA check ]]>, , < ![CDATA check ] ]>
Invalid: >CDATA in tag content,
text not allowed
Complex-1: deprecated elements
The PHP software script used for this web-page webpage is htmLawedTest.php, from PHP Labware.
Complex-2: deprecated attributes
aa

image

Section

Para

  1. First item
  1. First item

Complex-3: embed, object, area


navigate the site: 1 | 3 | 4

Complex-4: nested and other tables
Cell
Cell
Cell
Cell Cell Cell
Cell
Cell Cell Cell

PCDATA wrong: Well
Hello

Missing tr:
Well

Complex-5: pseudo, disallowed or non-HTML tags
(Try different 'keep_bad' values) <*> Pseudotags <*> Non-HTML tag xml

Disallowed tag p

Elements
Unbalanced: check
Non-XHTML:

Malformed: < a href="">, , , , < /a>, < a href="">, a, a,
Invalid: a
Empty: a, a, atext
Content invalid: 12
Content invalid?:

(try setting 'form' as parent) Casing:
Entities
Special: & 3 < 2 & 5>4 and j >i >a & ia
Padding: B B f f  
Malformed: & #x27;, &x27;, ' &TILDE;, &tilde
Invalid: , �, , �, ￿, &bad;
Discouraged characters: , „, ﷠, 􏿾
Context: '>', <?
Casing: ', ', &TILDE;, ˜
(also check named-to-numeric and hexdec-to-decimal, and vice versa, conversions)
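(a minimal sketch, in Python, of the two conversions named above, for working out expected results by hand; it assumes that the special entities amp, lt, gt and quot stay in named form and that unknown or wrongly cased names are left untouched, which may not match htmLawed in every case; the function names are hypothetical)

```python
import re
from html.entities import name2codepoint

# Assumption: these special entities are kept in their named form.
KEEP_NAMED = {"amp", "lt", "gt", "quot"}


def named_to_numeric(text):
    """Convert known named entities to decimal numeric ones."""
    def repl(match):
        name = match.group(1)
        codepoint = name2codepoint.get(name)  # lookup is case-sensitive
        if codepoint is None or name in KEEP_NAMED:
            return match.group(0)  # unknown or special: leave as-is
        return "&#%d;" % codepoint
    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", repl, text)


def hex_to_decimal(text):
    """Convert hexadecimal numeric entities to decimal ones."""
    return re.sub(
        r"&#[xX]([0-9A-Fa-f]+);",
        lambda match: "&#%d;" % int(match.group(1), 16),
        text,
    )


if __name__ == "__main__":
    print(named_to_numeric("&tilde; &TILDE; &bad; &amp;"))
    print(hex_to_decimal("&#x27; &#X27;"))
```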
Format
Valid but ill-formatted: text text text text
p r e
text text

text none text text none t e x t
text none t e x t text none t e x t
p r e  
				pre
		
Cell
Cell
Cell
CellCellCell
Cell
CellCellCell
(try to compact or beautify)
Forms
(note nesting of 'form', missing required attributes, etc.)
pl
h


B:C:

(try each of these lines separately)
what
what (try with container as div and as form)
c a b
HTML comments (also CDATA)
Special characters inside: , , , c
Normal: , , comment:,
text not allowed

Malformed: , < ![CDATA check ]]>, < ![CDATA check ] ]>
Invalid: >comment in tag content,
Ins-Del
(depending on context, these elements can be of either block or inline type)

block


d


d

d

d
Lists
Invalid character data:
  • (item
  • )

dd/dl/dt:
a
bad
first one
b
second

Complex:
Non-English text-1
Inscrieţi-vă acum la a Zecea Conferinţă Internaţională
გთხოვთ ახლავე გაიაროთ რეგისტრაცია
večjezično računalništvo
Зарегистрируйтесь сейчас на Десятую Международную Конференцию по
(this file should have UTF-8 encoding; some characters may not be displayed because of missing fonts, etc.)
Non-English text-2: entities
用统一码
გთხოვთ
Inscreva-se agora para a Décima Conferência Internacional Sobre O Unicode, realizada entre os dias 10 e 12 de março de 1997 em Mainz na Alemanha.
Ruby
(need compatible browser)
さい とう のぶ W3C Associate Chairman
WWW (World Wide Web)
A (aaa)
URLs
Relative and absolute: , , , , , ,
(try base URL value of 'http://a.com/b/')
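(a small sketch, in Python, for working out what relative URLs should become against the base 'http://a.com/b/'; it uses the standard library's resolver rather than htmLawed's own logic, and the sample URLs are hypothetical, not the ones in the test line above)

```python
from urllib.parse import urljoin

BASE = "http://a.com/b/"

# Hypothetical sample URLs covering the usual relative and absolute forms.
SAMPLES = ["c.htm", "../c.htm", "/c.htm", "?x=1", "#top", "http://x.com/"]


def resolve_all(urls, base=BASE):
    """Resolve each URL against the base, following standard URL rules."""
    return [urljoin(base, url) for url in urls]


if __name__ == "__main__":
    for original, resolved in zip(SAMPLES, resolve_all(SAMPLES)):
        print(original, "->", resolved)
```

(absolute URLs pass through unchanged; a filter's absolute-to-relative mode would be the reverse operation)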
CSS URLs:
,
,
,
,

Anti-spam: (try regex for 'http://a.com', etc.) , , , , , ,
XSS
'';!--"=&{()}









x


test
Other
3 < 4
3 > 4
> 3