OpenCOBOL Forum Index OpenCOBOL
Problems with non ascii character in ascii source
| Poster | Thread |
|---|---|
| fistons | Posted on: 2009/11/2 10:26 |
Just popping in Joined: 2009/11/2 Posts: 3 |
Problems with non ascii character in ascii source
Hi,
I am trying to convert several COBOL sources to C using OpenCOBOL 1.0, but several of these sources don't compile, reporting, for example:
ESSFCRP0:64: Error: syntax error, unexpected WORD, expecting EXTERNAL or GLOBAL
At line 64 of the COBOL source, here is what I've got:
03 FILLER PIC X(09) VALUE '!øLàà
Here is the hex equivalent:
30 33 20 20 46 49 4c 4c 45 52 20 50 49 43 20 58 28 30 39 29 20 56 41 4c 55 45 20 27 21 01 f8 4c e0 e0 00 00 00 27 2e
The COBOL source comes from MVS. How can I resolve this issue? Should I modify all my COBOL sources to use hex literals, something like: 03 FILLER PIC X(09) VALUE X'AABBEE00'. Or maybe there is an option in OpenCOBOL? Thanks |
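The hex dump in the post above can be inspected mechanically. What follows is a minimal Python sketch, not something posted in the thread: it parses the dump as given, lists the bytes that fall outside printable ASCII, and renders the bytes between the two apostrophes as a COBOL hex literal. The helper names are made up for illustration. Whether the byte values in the transferred file still match the mainframe original is an open question, since the transfer may already have translated some of them.

```python
# Hex dump of line 64 exactly as posted in the thread.
HEX_DUMP = (
    "30 33 20 20 46 49 4c 4c 45 52 20 50 49 43 20 58 28 30 39 29 "
    "20 56 41 4c 55 45 20 27 21 01 f8 4c e0 e0 00 00 00 27 2e"
)


def non_printable(data: bytes) -> list:
    """Return (offset, byte) pairs for bytes outside printable ASCII (0x20-0x7E)."""
    return [(i, b) for i, b in enumerate(data) if not 0x20 <= b <= 0x7E]


def literal_as_hex(data: bytes) -> str:
    """Render the bytes between the first and last apostrophe as X'...'."""
    start = data.index(b"'") + 1
    end = data.rindex(b"'")
    return "X'" + data[start:end].hex().upper() + "'"


line = bytes.fromhex(HEX_DUMP)
suspects = non_printable(line)
hex_literal = literal_as_hex(line)

if __name__ == "__main__":
    for offset, byte in suspects:
        print(f"offset {offset}: 0x{byte:02X}")
    print(hex_literal)
```

The dump holds nine bytes between the apostrophes, which matches the PIC X(09) declaration, and it does end with a closing apostrophe and a period even though the displayed line does not show them.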
| human | Posted on: 2009/11/2 13:09 |
Home away from home Joined: 2007/5/15 Posts: 967 |
Re: Problems with non ascii character in ascii source Concerning the hex values: Don't you need to convert the text to another character set anyway?
Where do you want to compile and run your old source? human |
| fistons | Posted on: 2009/11/2 13:29 |
Just popping in Joined: 2009/11/2 Posts: 3 |
Re: Problems with non ascii character in ascii source Thanks for your reply
Concerning the hex values, I think so too, but the strange thing is that the source seems to contain two character sets: most of it is ASCII and only a few parts contain those strange characters. My old sources come from MVS and I would like to run them on Red Hat Linux. For my tests, I use Cygwin. |
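To find out which parts of a mostly ASCII source contain the strange characters, the file can be scanned line by line. This is a minimal Python sketch under stated assumptions: the function name and the sample bytes are hypothetical, and "strange" is taken to mean any byte outside printable ASCII other than a tab.

```python
def find_suspect_lines(source: bytes) -> dict:
    """Map 1-based line numbers to the offsets of bytes outside printable ASCII."""
    report = {}
    for lineno, raw in enumerate(source.split(b"\n"), start=1):
        raw = raw.rstrip(b"\r")  # tolerate CRLF line endings
        offsets = [
            i for i, b in enumerate(raw)
            if not (0x20 <= b <= 0x7E or b == 0x09)
        ]
        if offsets:
            report[lineno] = offsets
    return report


# Hypothetical two-line sample; the second line carries raw bytes in its literal.
sample = b"       01 REC.\r\n       03 FILLER PIC X(02) VALUE '\x01\xf8'.\n"
report = find_suspect_lines(sample)

if __name__ == "__main__":
    for lineno, offsets in report.items():
        print(f"line {lineno}: suspicious bytes at offsets {offsets}")
```

For a real file, the bytes would come from reading it in binary mode, for example `open(path, "rb").read()`, so that nothing is decoded before the scan.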
| human | Posted on: 2009/11/2 14:03 |
Home away from home Joined: 2007/5/15 Posts: 967 |
Re: Problems with non ascii character in ascii source From what you've said, it seems you need to change these values anyway. It sounds like the text files were converted incorrectly. How did you get them to your PC? Maybe you can look at the old sources on MVS with a terminal and check the strange characters there?
human |
| fistons | Posted on: 2009/11/2 14:15 |
Just popping in Joined: 2009/11/2 Posts: 3 |
Re: Problems with non ascii character in ascii source In fact, I'm the guy who transferred them from MVS (I'm not a z/OS person at all), so I guess I'll have to check that with the person in charge.
I also have the source in EBCDIC format. Can OpenCOBOL handle this format? If it can, how do I do it? I tried a simple cobc -std=mvs source.cob, but it did not work. |
| human | Posted on: 2009/11/2 19:34 |
Home away from home Joined: 2007/5/15 Posts: 967 |
Re: Problems with non ascii character in ascii source -std is not used for stuff like that. But it could be interesting to implement an internal character conversion (either when the compiler recognizes the charset or if a new switch like --convert-ebcdic / --convert-ascii is used) that is used before the input files go through the parser. It's clear that there are different ebcdic charsets, no idea what would be best to use. One could use tables like this or that.
But on second thought, it may be better not to extend the compiler when there are plenty of ways to convert the files outside of it. What do you think? human |
| simrw | Posted on: 2009/11/3 8:56 |
Webmaster Joined: 2005/5/31 From: Bad Soden, Germany Posts: 776 |
Re: Problems with non ascii character in ascii source Check if you have the "iconv" utility.
(Linux systems have it, as do Solaris, AIX, HP-UX). Then, assuming you have the EBCDIC source as "PROG.ebc" - "man iconv" will show you other options. The codeset parameter is not standardized, so if it does not like "EBCDIC-US", choose one from the list produced by "iconv -l". Roger |
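Where iconv is not at hand, the same kind of character set conversion can be sketched in Python. This is only an illustration under an assumption: it uses IBM code page 037 (Python codec `cp037`), while the thread does not say which EBCDIC variant the MVS system uses. Python ships other EBCDIC codecs as well, such as `cp500` and `cp1140`. The function name and sample bytes are hypothetical.

```python
def ebcdic_to_text(raw: bytes, codepage: str = "cp037") -> str:
    """Decode EBCDIC bytes into a Python string using the given code page."""
    return raw.decode(codepage)


# "PIC X" in code page 037: P=0xD7, I=0xC9, C=0xC3, space=0x40, X=0xE7.
sample = bytes([0xD7, 0xC9, 0xC3, 0x40, 0xE7])
converted = ebcdic_to_text(sample)

if __name__ == "__main__":
    print(converted)
```

A conversion like this treats every byte as a character, so it is only appropriate for the parts of the source that really are text.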
| wmklein | Posted on: 2009/11/3 9:13 |
Home away from home Joined: 2008/12/27 Posts: 243 |
Re: Problems with non ascii character in ascii source I certainly could be mistaken on this, but my guess is that the literal (on the mainframe) contains "non-displayable" characters. The ISPF editor on IBM mainframes makes it very easy to enter such characters into COBOL source code as literals. These were commonly used before IBM supported hex literals.
If so, then neither conversion when "downloading" nor conversion on a Unix/Linux system is likely to give you the results that you want. You need to check the program logic and see what is done with such literals. For example, if they are moved to a group item that includes binary sub-fields, this will NOT give desirable results. Some of the most common uses of such fields on the mainframe are for "attribute bytes" or "bit patterns". As I say, I may be mistaken about what is happening here, but checking with those who "know" the program on the mainframe and/or tracing what happens to the field after the move statement may tell you the best way to handle such fields. |
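The concern about binary sub-fields can be illustrated with a small Python sketch. It assumes code page 037 on the EBCDIC side and a hypothetical two-byte big-endian binary field; neither detail comes from the thread.

```python
# A hypothetical 2-byte big-endian binary field holding 241 (0x00F1).
original = bytes([0x00, 0xF1])

# Character conversion treats 0xF1 as the EBCDIC digit '1' and rewrites it
# as ASCII 0x31, which silently changes the numeric value of the field.
translated = original.decode("cp037").encode("latin-1")

value_before = int.from_bytes(original, "big")
value_after = int.from_bytes(translated, "big")

if __name__ == "__main__":
    print(value_before, value_after)
```

The field goes in as 241 and comes out as 49, which is why bytes that are data rather than text need to be carried over unchanged, for instance as hex literals.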
| btiffin | Posted on: 2009/11/9 3:51 |
Home away from home Joined: 2008/6/7 From: CANADA Posts: 755 |
Re: Problems with non ascii character in ascii source Umm, not to under complicate things, but I get that very same message for any
No closing quote on the value or a period... It's a lexer's 'root' backtrack that causes this fairly obscure error message when many of the compiler's lexical scans end up in defeat due to syntax errors. I'm not sure you showed all of line 64 ... as a PIC X(9) should not dump as 40 hex bytes and there must be more to the record than the 03 filler, so I can't say that it might be that simple of a solution. Keep the weirdo binary source and just put in a closing quote? Cheers, Brian |
| wmklein | Posted on: 2009/11/9 9:49 |
Home away from home Joined: 2008/12/27 Posts: 243 |
Re: Problems with non ascii character in ascii source I was GUESSING that the original alphanumeric literal included the hex characters for LF, CR, or similar sequences. These can easily be added within text via mainframe editors, and mainframe source code is ALWAYS "record sequential" (80 bytes with 72 bytes of "data") and never "line sequential" in nature.
It would be "nice" to see the full line (and the lines before and after it) and an indication of what "non-ASCII" characters are there, and where they are. |
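The difference between record sequential and line sequential files can be made concrete with a short Python sketch. It assumes the file was transferred in binary mode with no translation, that records are exactly 80 bytes with the source text in columns 1-72, and that the code page is 037; the function name and the sample records are hypothetical.

```python
RECORD_LENGTH = 80   # fixed record size described in the post
DATA_COLUMNS = 72    # columns 1-72 carry the source text


def records_to_lines(raw: bytes, codepage: str = "cp037") -> list:
    """Split fixed-length records and decode columns 1-72 of each one."""
    if len(raw) % RECORD_LENGTH:
        raise ValueError("input is not a whole number of 80-byte records")
    lines = []
    for start in range(0, len(raw), RECORD_LENGTH):
        record = raw[start:start + RECORD_LENGTH]
        lines.append(record[:DATA_COLUMNS].decode(codepage).rstrip(" "))
    return lines


# Two hypothetical records built by encoding ASCII text into code page 037.
sample = b"".join(
    text.ljust(RECORD_LENGTH).encode("cp037")
    for text in ("       01 REC.", "          03 FILLER PIC X(09).")
)
lines = records_to_lines(sample)

if __name__ == "__main__":
    print("\n".join(lines))
```

Because the split is by length and not by line terminator, a stray CR or LF byte inside a literal cannot break a record in two. Decoding the literal's bytes as characters still carries the risk raised earlier in the thread.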