forked from CMB/edbrowse
-
Notifications
You must be signed in to change notification settings - Fork 0
/
README
330 lines (278 loc) · 12.9 KB
/
README
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
edbrowse, a line oriented editor browser.
Maintained by Chris Brannon, [email protected]
Written and maintained by Karl Dahlke, [email protected]
See COPYING for licensing agreements.
------------------------------------------------------------
Disclaimer: this software is provided as-is,
with no guarantee that it will perform as expected.
It might trash your precious files.
It might send bad data across the Internet,
causing you to buy a $37,000 elephant instead of
$37 worth of printer supplies.
It may delete all the rows in your mysql customer table.
Use this program at your own risk.
------------------------------------------------------------
Netscape and Explorer are graphical browsers.
Lynx and Links are screen browsers.
This is a command line browser, the only one of its kind.
The user's guide can be found in doc/usersguide.html (in this package).
Of course this reasoning is a bit circular.
You need to use a browser to read the documentation,
which describes how to use the browser.
Well you can always do this:
cd doc ; lynx -dump usersguide.html >usersguide.txt
This produces the documentation in text form,
which you can read using your favorite editor.
Of course we hope edbrowse will eventually become your
favorite editor, whence you can browse the documentation directly.
The doc directory also includes a sample config file, and a script to help you
set up your own, customized config file.
Run setup.ebrc and answer the questions.
(This assumes you are using bash.)
------------------------------------------------------------
Ok, I'm going to assume you've read the documentation.
No need to repeat all that here.
You're here because you want to compile and/or package the program,
or modify the source in some way. Great!
I haven't developed a configure script, and I should do that some day,
but meantime just type make and hope for the best.
As you may know, edbrowse was originally a perl script.
As such, it was only natural to use perl regular expressions for
my search/substitute functions in the editor.
And once you've experienced the power of perl regexp, you'll never
go back to ed. So I use the perl compatible regular expression
library, /lib/libpcre.so.0, available on most Linux systems.
If you don't have this file, check your installation disks.
the pcre and pcre-devel packages might be there, just not installed.
You need version 8.10 or higher.
Note that my files include <pcre.h>.
Some distributions put it in /usr/include/pcre/pcre.h,
so you'll have to adjust the source, the -I path, or make a link.
Yeah I know, that's something a configure script should do for you,
but I'm just not there yet.
You also need libcurl and libcurl-devel,
which is included in almost every Linux distro.
This is used for ftp, http, and https.
Check for /usr/include/curl/curl.h
Edbrowse requires version 7.29.0 or later. If you compiled with a version
prior to 7.29.0, the program will inform you that you need to upgrade.
If you have to compile curl from source, be sure to specify
--ENABLE-VERSION-SYMBOLS (or some such) at the configure script.
Finally, you need the Spider Monkey javascript engine from Mozilla.org
ftp://ftp.mozilla.org/pub/mozilla.org/js/
Edbrowse 3.5.1 and higher requires Mozilla js version 2.4.
If this is not available in your distribution, download into /usr/local,
expand, cd js/src, ./configure, make, and make install.
https://ftp.mozilla.org/pub/mozilla.org/js/mozjs-24.2.0.tar.bz2
Lots of warning messages in the build, don't worry about those.
This creats /usr/local/lib/libmozjs-24.so, and with /usr/local/lib
already in my /etc/ld.so.conf, I thought I'd be fine,
but I had to run ldconfig again anyways, not sure why.
Back in your edbrowse/src directory,
you may need to set some flags on the command line
if you had to install Mozilla js as above.
The following works for me.
make JS_CXXFLAGS=-I/usr/local/include/mozjs-24 JSLIB=-lmozjs-24 edbrowse
If you want database access, you need unixODBC and unixODBC-devel.
Make edbrowse-odbc, rather than edbrowse.
ODBC has been very stable for a long time.
unixODBC version 2.2.14 seems to satisfy edbrowse-odbc.
------------------------------------------------------------
The code in this project is indented via the script Lindent,
which is in the tools directory, and is taken from the Linux kernel source.
In other words, the indenting style is the same as the Linux kernel.
If you modify some source, you may want to run it through
../tools/Lindent before the commit.
------------------------------------------------------------
Debug levels:
0: silent
1: show the sizes of files and web pages as they are read and written
2: show the url as you call up a web page,
and http redirection.
3: javascript execution and errors.
cookies, http codes, form data, and sql statements logged.
4: show the socket connections, and the http headers in and out
5: messages to and from the javascript process, url resolution
6: show javascript to be executed
7: reformatted regular expressions, breakline chunks, read from js.
8: text lines freed, debug garbage collection
9: not used
------------------------------------------------------------
Sourcefiles (in the src directory).
main.c:
Read and parse the config file.
Entry point.
Command line options.
Invoke mail client if mail options are present.
If run as editor/browser, treat arguments as files or URLs
and read them into buffers.
Read commands from stdin and invoke them via the command
interpreter in buffers.c.
buffers.c:
Manage all the text buffers.
Interpret the standard ed commands, move, copy, delete, substitute, etc.
Run the 2 letter commands, such as qt to quit.
stringfile.c:
Helper functions to manage memory, strings, files, directories.
Hides many of the OS specific quirks.
url.c:
Split a url into its components.
Decide if it's a proxy url.
Resolve relative url into absolute url
based on the location of the current web page.
format.c:
Parse html tags and comments.
Translate common unicode sequences.
Arrange text into lines and paragraphs.
http.c:
Send the http request, and read the data from the web server.
Handles https connections as well,
and 301/302 redirection.
Uncompresses html data if necessary.
html.cpp:
Render the html tags and format the text.
Update input fields.
Submit/reset forms.
cookies.c:
Send and receive cookies. Maintain the cookie jar.
auth.c:
Remember user and password for web pages that require authentication.
Only the basic method is implemented at this time.
sendmail.c:
Send mail (smtp or smtps). Encode attachments.
fetchmail.c:
Fetch mail (pop3 or pop3s). Decode attachments.
Browse mail files, separate mime components.
messages.h:
Symbolic constants for the warning/error messages of edbrowse.
messages.c:
Strings corresponding to these error messages,
in various languages.
Also some international print routines to display the correct string
according to your locale.
jseng-moz.cpp:
The javascript engine, built around the mozilla js library.
Manage all the js object corresponding to the web page in edbrowse.
This is a stand alone process.
The resulting program is edbrowse-js,
and it must be in the same path as edbrowse.
All the js details are hideen in this file.
It might be possible, for instance, to build jseng-v8.c, based upon v8.
startwindow.js:
Javascript that is run at the start of each session.
This creates certain support methods that client js will need.
It is converted into a const string in startwindow.c,
thus startwindow.c is not a source file.
As you write functions to support DOM,
your first preference is to write them in startwindow.js.
Failing this, write them in C, using the api in ebjs.c.
Failing this, and as a last resort, write them as native code in the js engine.
ebjs.c:
Interface between edbrowse and edbrowse-js.
Sends interprocess messages to js to manipulate the js objects,
and returns the result back to edbrowse.
Think of edbrowse-js as the js server, and edbrowse as the client.
jsrt:
This is the javascript regression test for edbrowse.
It exercises some of the javascript DOM interactions.
dbops.c:
Database operations; insert update delete.
dbodbc.c:
Connect edbrowse to odbc.
dbinfx.ec:
Connect edbrowse directly to Informix.
------------------------------------------------------------
Error conventions.
Unix commands return 0 for ok, and a negative number for a problem.
Some of my functions work this way, but many return
true for success and false for error.
The error message is left in a buffer, which you can see by typing h,
in the /bin/ed style.
Sometimes the error is displayed no matter what,
like when you are reading or writing files.
error messages are created according to your locale, i.e. in your language,
if a translation is available.
Some error messages in the database world have not yet been internationalized.
Some are beyond my control, as they come from odbc or its native driver.
------------------------------------------------------------
Multiple Representations.
A web form asks for your name, and you type in Fred Flintstone.
This piece of data is part of your edbrowse buffer.
In this sense it is merely text.
You can make corrections with the substitute command, etc.
Yet it is also carried into the html tags in html.c,
so that it can be sent when you push the submit button.
This is a second copy of the data.
As if that weren't bad enough, I now need a third copy for javascript.
When js accesses form.fullname.value, it needs to find,
and in some cases change, the text that you entered.
These 3 representations are "separate but equal",
there is a lot of software to keep them in sync.
Remember that an input field could be an antire text area,
i.e. the text in another editing session.
When you are in that session, composing your thoughts,
am I really going to take every textual change, every substitute,
every delete, every insert, every undo,
and map those changes over to the html tag that goes with this session,
and the js variable that goes with this session?
I don't think so!
So I'm stuck with something almost as bad.
When you submit the form, or run any javascript for any reason,
all that text has to be carried into the javascript world.
This is accomplished by jSyncup() in html.c.
When js has run to completion, any changes it has made to the fields have
to be mapped back to the editor, where you can see them.
This is done by jSideEffects() in html.c.
In other words, any action that might in any way involve js
must begin with jSyncup() and end with jSideEffects().
The latter function notifies you if certain lines in the buffer have changed,
e.g. js has changed one of the input fields out from under you.
Line 357 has been updated.
Along with some cleanup, and even some modest enhancements, edbrowse
really needs a rewrite, at least with respect to browsing and javascript.
Today, it takes html text and tags, and builds
javascript objects to go along with those tags; and it builds the screen,
(I'll call it a screen, it's really a text buffer),
in parallel, at the same time.
What it should do is not build the screen at all on the first pass.
Words on the web page would simply be turned into a string
and put in as an attribute in whatever object we are working on at the time,
or the document object if no tag has appeared yet.
All internal.
Then you run the javascript, and it perhaps creates more objects,
creates lists dynamically,
builds entirely new paragraphs, images, links, whatever it does.
Then pass 3, render the screen by traversing the tree of objects
depth first.
That is your text buffer.
Now, if anything changes, through some onclick code perhaps,
you have a new tree of objects,
so Simply rerender the screen.
In our case that builds a new text buffer.
Maybe this is what standard browsers do; I don't know.
But there's something we should do that they don't.
Keep a copy of the old screen and run diff.
If one or two lines have changed,
or if a menu has changed behind the scenes, report that to the user.
It's something a sighted person would notice right away,
but I would be left in the dark.
If there are many changes, just say "entire buffer has changed".
Anyways, this is the kind of rewrite that needs to happen.
Think of this as version 4.1.
The beginning of this new approach can bee seen in rebuildSelectors() in ebjs.c.
If javascript rebuilds a dropdown list of options, that is detected,
and mapped back into html tags, and the user is notified.
See jsrt states and colors for this feature.
After this, enhance the DOM with the createElement methods.
This lets javascript create its own tables, paragraphs, lists, images, etc,
and yes, some websites do indeed do this.
https://developer.mozilla.org/en/docs/Traversing_an_HTML_table_with_JavaScript_and_DOM_Interfaces
Version 4.2.
Then, stop turning html tags into javascript objects by hand;
that is reinventing the wheel.
When you run into <P>, simply call createElement("paragraph"),
or something along those lines.
We have to implement all this in DOM JS eventually, so once this is done,
why not take advantage of it.
Version 4.3.
Just some thoughts for the future.