Montag, 27. Februar 2012

total abuse of technology



I had a (for my environment) unusual request:
After the migration of a Repository Database from 9i to latest 10g I was asked to keep a backup of the old DB for at least 3 years.
This does not sound very unusual, but it's not that simple in our environment. We do only keep backups for weeks to some month, worst case. I also cannot just backup the datafiles at all: The old database run on Solaris, but we are switching to Linux right now. With just some bad luck I would not have any system to restore (or open) this database backup at all.
This brought me to another solution; in my point of view it was not worth to write a blog about it, but I was asked by Leighton L. Nelson and so I write:

  1. export of the full database
    I run a simple export of the database. There is no expdp in 9i, so the choice was easy.

  2. compress the files
    the dump (and the logfile!) where tared together and compressed. Just to save space.

  3. prepare a proper store
    As mentioned above, there is no dedicated system for this purpose. So I had to prepare a place where the dump is safe. As a DBA, of course I know a good place to store data: A database!
    First a DBFS came to my mind. But the DB is in Version 10.2 - no DBFS.
    But it's quite simple to do the important steps manually:

    create tablespace old_dump datafile '+<DG>' autoextend on;
    create user old_dump identified by dump_old1 default tablespace old_dump;
    GRANT CONNECT, CREATE SESSION, CREATE TABLE to old_dump;
    alter user old_tech_dump quota unlimited on old_dump;

    connect old_dump/dump_old1

    create table old_dump_store
    (id integer primary key, description VARCHAR(2000), file_store BLOB)
    LOB (file_store) STORE AS SECUREFILE
    (TABLESPACE old_dump DISABLE STORAGE IN ROW NOCACHE LOGGING);


  4. insert the dump (and some metadata)

    There is a nice way in SQL Developer to load a file to a BLOB. It's just so simple.
    At last some words in the comment field are worth - so everyone knows what's inside the BLOB.
It still might sound strange to save the dump of an old database into it's descendant. But at the end: do you know a better place to store data than a database?

Donnerstag, 16. Februar 2012

restore DBMS_SCHEDULER.CREATE_CREDENTIAL cleartext password


If you want to use Oracle file watcher, you need to Create a Credential. As there a password needs to be stored in the database, Oracle tries to save it in a secure way. But as the password must be decrypted for the purpose to login on the file watchers agent side, it is not safe at all:
The credentials are stored with DBMS_SCHEDULER.CREATE_CREDENTIAL. Here an example:

exec DBMS_SCHEDULER.CREATE_CREDENTIAL(
  credential_name => 'local_credential',
  username => 'oracle',  password => 'welcome1');
exec DBMS_SCHEDULER.CREATE_CREDENTIAL(
  credential_name => 'local_credential2',
  username => 'oracle2', password => 'welcome1');


It's quite easy to see the values again:

select o.object_name credential_name, username, password
 FROM SYS.SCHEDULER$_CREDENTIAL c, DBA_OBJECTS o
 WHERE c.obj# = o.object_id;

CREDENTIAL_NAME    USERNAME PASSWORD
------------------ -------- ------------------------------------
LOCAL_CREDENTIAL   oracle   BWVYxxK0fiEGAmtiKXULyfXXgjULdvHNLg==
LOCAL_CREDENTIAL2  oracle2  BWyCCRtd8F0zAVYl44IhvVcJ2i8wNUniDQ==


At least the password is somehow encrypted, and even the password was welcome1 for both credentials, the encrypted string is not identical.

Nothing to blame here, but I mentioned, the password can be decrypted. So let's do so:

SELECT u.name CREDENTIAL_OWNER, O.NAME CREDENTIAL_NAME, C.USERNAME,
  DBMS_ISCHED.GET_CREDENTIAL_PASSWORD(O.NAME, u.name) pwd
FROM SYS.SCHEDULER$_CREDENTIAL C, SYS.OBJ$ O, SYS.USER$ U
WHERE U.USER# = O.OWNER#
  AND C.OBJ#  = O.OBJ# ;

CREDENTIAL_OWNER CREDENTIAL_NAME      USERNAME PWD
---------------- -------------------- -------- --------
SYS              LOCAL_CREDENTIAL     oracle   welcome1
SYS              LOCAL_CREDENTIAL2    oracle2  welcome1


Can you see it? It's there. Try it at home!
I don't blame anyone here. It's hard to store anything really safe in case you need to decrypt it also.
But don't expect your password save, if you store it with DBMS_SCHEDULER.CREATE_CREDENTIAL.
Maybe it's slightly to easy to use DBMS_ISCHED.GET_CREDENTIAL_PASSWORD (ok, only SYS can do so) but even it might be slightly more difficult in the future, the basic problem will still exist.

some tracing events in DBMS_SCHEDULER



I currently have the fun to review DBMS_SCHEDULER. As I'm always interested in ways to trace anything, to dig deeper in case of problems, I searched for ways to trace it.
As I did not find a collected list of events anywhere, I start them here. It's by far not a complete list, so feel free to discuss and contribute, if you want!

event 10862


resolve default queue owner to current user in enqueue/dequeue
Cause: resolve default queue owner to current user in enqueue/dequeue.
Action: turn on if client wish to resolve the default queue owner to the current user. If not turned on, the default queue owner will be resolved to the login user.
This event is not checked the way you might imagine. Just in the area of REMOVE_JOB_EMAIL_NOTIFICATION if it's 0, it's set to 1 for a call of DBMS_AQADM.REMOVE_SUBSCRIBER and set to 0 afterwards.

27401

scheduler tracing event

  • bit 0x10000 - Logging e-mail to table and returning
    bitand( ,65536)
    logs informations about sending emails into table sys.scheduler$_sent_emails
  • bit 0x20000 - start DBMS_DEBUG_JDWP.CONNECT_TCP in file watcher
    bitand( ,131072)
    starts DBMS_DEBUG_JDWP.CONNECT_TCP on localhost, port 4444
    I'm not sure if I like this event. In general I don't want any software opening connections without my knowing. And I could not find this documented anywhere.
    Is it fair to call this a backdoor?
  • bit 0x40000 - starts tracing in file watcher
    bitand( ,262144)
    logs informations about file watcher into trace file

27402

scheduler tracing event

  • bit 0x40 - starts tracing about emails
    bitand( ,64)
    similar to event 27401 bit 0x10000, but tracefile instead of table
  • bit 0x80 - starts tracing about emails
    bitand( ,128)
    logs information about email jobs into trace file
  • bit 0x100 - starts tracing in chains
    bitand( ,256)
    logs information about chains into trace file

I guess there is at least also a bit 0x200, but could not prove it right now.

27403

scheduler stop job event


I did not find anything about it yet. comments are most welcome!

Montag, 6. Februar 2012

get your traces - yourself



I'd like to mention a small peace of software. It's called MrTrace and available in Version 2.0.0.43 right now. For me it's a tool to save time. So what is it doing at all?
MrTrace is a plugin for Oracles SQL Developer to access tracefiles via SQL Developer. It's previous version could only access the tracefile for the statement you just executed. But since version 2 you anyone with the right permissions access any tracefile in the trace directory.

For a DBA it does not sound spectacular to access tracefiles, but it can be quite annoying to get and distribute tracefiles for developers. In my current company there is no OS-access to databae servers for anyone except DBAs and OS-admins. this means someone must copy over the traces to make them accessible for others. It's not a complex task, but it's disturbing.
With MrTrace I can grant anyone who knows how to use SQL Developer permissions to access trace files. So it saves a lot of time, for me and the developer. At a price of less than us$50 it should amortize in no time, if you have a diligent developer.

A list of my very private findings:

PROs:

  • it's easy to install (on the client side) 
  • it's not OS-dependent - you can apply it on any client-OS where you can start SQL Developer
  • the installation script for the database user and objects is not wrapped. So you can review and even change this part of the software. (in my case, we have a PASSWORD_VERIFY_FUNCTION enforced in my company. I needed to edit the installation script to create the user MRTRACE.
  • The support of Method R is great! I had the joy to participate in their beta program for version 2 of MrTrace. We had some nice conversations.  

CONs:

  • You need SQL Developer for the client side. - No big deal for me, but in some companies that might be a problem.
  • MrTrace needs java to do some tasks. Unfortunately there is no method to list the content of a directory, so java is needed. 
  • OS-commands like ls,  find and xargs are used. There is nothing bad about these commands, but I don't see anything they do what cannot be done in java directly. So for me it increases complexity without a need. 

And no, I am not an employee of Method R, the only relation is the software license I bought myself. 

Dienstag, 31. Januar 2012

Setting Up Oracle Connection Manager (without SOURCE_ROUTE)


This post must be seen as a direct follow up to Arup Nandas Setting Up Oracle Connection Manager.
As there are many references to this post, please read it first. Problem and Solution are quite similar, only the architecture is a little bit different:

The Architecture

 The network diagram of the three machines is slightly different:


There is a new needed connection: from the instance on dbhost1 to the connection manager on cmhost1.

After changing the setup, you will need to rewrite the TNSNAMES.ORA in the following way:

TNS_CM = 
  (DESCRIPTION = 
    (ADDRESS = 
      (PROTOCOL = TCP)(HOST = cmhost1)(PORT = 1950)
    )
    (CONNECT_DATA = 
      (SERVICE_NAME=srv1)
    )
  )

You see, the (SOURCE_ROUTE = YES) disappeared as well as the ADDRESS of the listener on dbhost1.

How it Works


Note, all the special parameters and settings on the clients TNSNAMES.ORA disappeared. But the cman must know about the SERVICE_NAME it has to serve. As the cman can be seen as a special kind of listener, there is a common way a listener gets informed about a SERVICE_NAME: the Instance has to register the services to the listener. In general this is done by pmon at registering to logal_listener and remote_listener. In this case, remote_listener is the magic parameter.

Setting Up


You can follow step (1) to (9) as in Arups blog.
But before (10) an additional step is required:

(x) on the instance add the cman to remote_listener:

Alter System Set remote_listener='(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=cmhost1)(PORT=1950))))' scope=both;

If there is already an entry in remote_listener, e.g. in a RAC, you can separate the different connection strings by comma. An example can be

Alter System Set remote_listener='SCAN-IP:1521,(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=cmhost1)(PORT=1950))))' scope=both;

(For more details about SCAN I'd recommend this PDF)

CMCTL Primer

As we have now the services registered also on cman, we can see it there. The  SHOW command has a 2nd parameter services. Here an example

Services Summary...
Proxy service "cmgw" has 1 instance(s).
  Instance "cman", status READY, has 2 handler(s) for this service...
    Handler(s):
      "cmgw001" established:1 refused:0 current:0 max:256 state:ready
         <machine: 127.0.0.1, pid: 16786 >
         (ADDRESS=(PROTOCOL=tcp)(HOST=127.0.0.1)(PORT=44391))
      "cmgw000" established:1 refused:0 current:0 max:256 state:ready
         <machine: 127.0.0.1,pid: 16784>
         (ADDRESS=(PROTOCOL=tcp)(HOST=127.0.0.1)(PORT=44390))
Service "INSTANCE1" has 1 instance(s).
  Instance "INSTANCE1", status READY, has 1 handler(s) for this service...
    Handler(s):
      "DEDICATED" established:0 refused:0 state:ready
         REMOTE SERVER
         (ADDRESS=(PROTOCOL=TCP)(HOST=dbhost1)(PORT=1521))
Service "cmon" has 1 instance(s).
  Instance "cman", status READY, has 1 handler(s) for this service...
    Handler(s):
      "cmon" established:3 refused:0 current:1 max:4 state:ready
         <machine: 127.0.0.1, pid: 16759>
         (ADDRESS=(PROTOCOL=tcp)(HOST=127.0.0.1)(PORT=44374))
The command completed successfully.

Fine Tuning

I try to create a dedicated service for all (or a well known set of) connections via the connection manager. By doing so it's sometimes easier to separate or identify different kinds of sessions.

Freitag, 9. Dezember 2011

Who created that process?

Figure 2-7
Connection to
a Dedicated
Server Process
For some reason I was really curios who created that process. It's not about a particular process in detail, mir a well known kind of processes. At least well known for DBAs.
Which process? 
It's one of these:

oracle   13096     1  0 20:05 ?        00:00:00 oracleTTT071 (LOCAL=NO)

Yes, it's a simple server process, nothing spectacular. Nevertheless, the Concepts guide is not very specific, who created that process. So I tried to find out in more detail.
On my linux sandbox the first column of ps -ef shows the UID, the second is the PID, the third is the PPID. Unfortunately it's 1 here, and I'm quite sure, this process was not created by init. So this proces is somewhat orphaned, as the direct parent disappeared. Very sad!
I decided to follow Figure 2-7 from the concepts guide. I used strace -f -p <PID_of_listener> to see what's going on. -f follows all forks, so also their actions are traced.
The first 3 lines are
Process 2979 attached with 3 threads - interrupt to quit
[pid  2981] futex(0xae8dee4, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
[pid  2980] restart_syscall(<... resuming interrupted call ...> <unfinished ...>


So we have 3 listener processes - it's good to know and probably worth to investigating this segregation of duties - but not in this post. There are so many interesting lines, but I'm searching for a process, so let's continue with

[pid  2979] clone(Process 27028 attached
child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x2aedd9914b80) = 27028
[pid  2979] wait4(27028, Process 2979 suspended
 <unfinished ...>
[pid 27028] clone(Process 27029 attached (waiting for parent)
Process 27029 resumed (parent 27028 ready)
child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x2aedd9914b80) = 27029
[pid 27028] exit_group(0)               = ?
Process 2979 resumed
Process 27028 detached
[pid  2979] <... wait4 resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27028
[pid 27029] close(15 <unfinished ...>
[pid  2979] --- SIGCHLD (Child exited) @ 0 (0) ---
[pid 27029] <... close resumed> )       = 0
[pid  2979] close(14 <unfinished ...>
[pid 27029] close(16 <unfinished ...>
[pid  2979] <... close resumed> )       = 0
[pid 27029] <... close resumed> )       = 0
[pid  2979] close(17)                   = 0


Here the listener ([pid  2979]) creates a new process by the first clone call. This new Process has the PID 27028. This new process has only one purpose: again clone a new Process: PID 27029 and use exit_group(0) to terminate directly afterwards. By this trick the listener is not shown as parent process for PID 27029. Directly after it's creation PID 27029 closes some file handles. As by the sequence of clone calls the new process inherited a table of all open file (and network) handles it seems it tries to get rid of any it does not need as early as possible. The next part
[pid  2979] fcntl(16, F_SETFD, FD_CLOEXEC) = 0
[pid 27029] setsid( <unfinished ...>
[pid  2979] fcntl(15, F_SETFD, FD_CLOEXEC <unfinished ...>
[pid 27029] <... setsid resumed> )      = 27029
[pid  2979] <... fcntl resumed> )       = 0
[pid 27029] geteuid()                   = 5831
[pid  2979] fcntl(13, F_SETFD, FD_CLOEXEC) = 0
[pid 27029] setsid()                    = -1 EPERM (Operation not permitted)
[pid  2979] poll([{fd=8, events=POLLIN|POLLRDNORM}, {fd=11, events=POLLIN|POLLRDNORM}, {fd=12, events=POLLIN|POLLRDNORM}, {fd=16, events=POLLIN|POLLRDNORM}, {fd=15, events=0}], 5, -1 <unfinished ...>

makes sure the file descriptos 16, 15 and 13 will remain after an execve(2) call.
And here it goes:
[pid 27029] execve("/appl/oracle/product/rdbms_112022_a/bin/oracle", ["oracleTTT051", "(LOCAL=NO)"], [/* 109 vars */]) = 0
from the man page if execve:
execve() executes the program pointed to by filename.
...
execve() does not return on success, and the text, data, bss, and stack of the calling process are overwritten by that of  the  program  loaded.   The  program invoked inherits the calling process’s PID, and any open file descriptors that are not set to close-on-exec.  Signals pending on the calling process are cleared.  Any signals set to be caught by the calling process are reset  to  their default behaviour.  The SIGCHLD signal (when set to SIG_IGN) may or may not be reset to SIG_DFL.
       If the current program is being ptraced, a SIGTRAP is sent to it after a successful execve().
       If  the  set-user-ID  bit  is set on the program file pointed to by filename, and the calling process is not being ptraced, then the effective user ID of the calling process is changed to that of the owner of the program file.  i Similarly,  when  the  set-group-ID bit of the program file is set the effective group ID of the calling process is set to the group of the program file.
From that point on there you can see how the server process comes to life. It's very interesting in some details, but not scope of this post. After some conversation between listener and server process using file descriptors 15 and 16 (I assume these are just sockets) both close these file descriptors. The listener also closes file descriptor 13 which seems to be the TCP connection to the client. From that point the 2 processes seems to be independent.

Well, now I know (at least on my test-system) the simplest way, the listener creates the process - and it uses execve to do so. There still are many questions open, like what's going on at this redirection as shown in Figure 2-8.

Montag, 28. November 2011

bzip2 twice?


To check the performance of RMAN backup I recently started to trace it a little bit. As most of the time was not spent in any reading from disk or writing to media manager library event, it was on CPU. It's good to know the CPUs are of any good, but as I still want to know what's going on I tried to dig any deeper. CPU circles are not just a magic black box where we put in a problem and the answer comes out after some times. At an abstraction layer it's a chain of functions where one is called by another, and only the last is the one doing anything. There is not much information in that fact per se, but developers are humans also, and they are giving the functions they code meaningful names.


So I had just to find these names (and where most of the time is spent) to figure out what's going on. To save my time I remembered Tanel Poders Advanced Oracle Troubleshooting Guide, Part 9 – Process stack profiling from sqlplus using OStackProf. There he described his tool ostackprof. This did all the job for me, I just had to find a rman session.

Here's the shortstack where most of the time was spent:
(This backup was done with COMPRESSION ALGORITHM ‘BASIC’)
->__libc_start_main()->main()->ssthrdmain()->opimai_real()->sou2o()->opidrv()->opiodr()->opiino()->opitsk()->ttcpip()->opiodr()->kporpc()->kkxrpc()->prient()->prient2()->pricbr()->pricar()->plsql_run()->pfrrun()->pfrrun_no_tool()->pfrinstr_ICAL()->pevm_icd_call_common()->krbibpc()->krbbpc()->krbb3crw()->krbbcdo()->kgccdo()->kgccbz2pseudodo()->kgccbz2do()->kgccm()->kgccbuf()->kgccgmtf()->__sighandler()->->

The naming convention for functions is not public documented by oracle, but for some reasons I'm sure functions starting with krb are related to backup, whereas kgcc is used for compression. Especially the working function kgccgmtf reads like generate Move To Front.

At that point I had a lot more information than before, still I had no way how to improve the backup speed. As we have licensed advanced compression for that particular node, we tested with different other compression methods. LOW and MEDIUM where faster, with less compression than our previous BASIC. But HIGH was even slower!

So again I used ostackprof and that's the topmost stack trace - for HIGH:
->__libc_start_main()->main()->ssthrdmain()->opimai_real()->sou2o()->opidrv()->opiodr()->opiino()->opitsk()->ttcpip()->opiodr()->kporpc()->kkxrpc()->prient()->prient2()->pricbr()->pricar()->plsql_run()->pfrrun()->pfrrun_no_tool()->pfrinstr_ICAL()->pevm_icd_call_common()->krbibpc()->krbbpc()->krbb3crw()->krbbcdo()->kgccdo()->__PGOSF209_kgccbzip2pseudodo()->kgccbzip2do()->BZ2_bzCompress()->handle_compress()->BZ2_compressBlock()->generateMTFValues()->__sighandler()->->


Do you see the difference? Until kgccdo there is no! And even afterwards, the functions are somewhat similar. One more thing is worth to mention: the bzip2 implementation for HIGH does not use oracle internal naming convention. So it's worth to search for these names on the internet. one of my best hits was a compress.c File Reference.

Did Oracle reinvent the wheel? No. For me it looks as if they tried their best first (by doing their own kgcc implementation) and afterwards preferred simple copy&paste. Maybe they should just skip either of these 2 - they still can use parameters to achieve different compression quality. 


If someone is interested in our results:
for a single datafile of 30GB (with 100% usage) we achieved on a production system - with all it ongoing tasks:


Typeminbackup-size
BASIC13:325.8
LOW5:178
MEDIUM8:526.14
HIGH65:294.25

We decided to choose MEDIUM.